Abstract

Collateral damage presents a significant risk during airdrops and airstrikes,
risking citizens’ lives and property and straining the relationship between the
United States Air Force and host nations. This dissertation presents a
methodology to determine the optimal location for making supply airdrops in
order to minimize collateral damage while maintaining a high likelihood of
successful recovery. A series of non-linear optimization algorithms is presented
along with their relative success in finding the optimal location in the airdrop
problem. Additionally, we present a quick algorithm for accurately creating the
Pareto frontier in the multi-objective airstrike problem. We demonstrate the
effect of differing guidelines, damage functions, and weapon employment
selection, which significantly alter the location of the optimal aimpoint in
this targeting problem. Finally, we provide a framework for making policy
decisions in fast-moving troops-in-contact situations where observers are unsure
of the nature of possible enemy forces in both finite and infinite time horizon
problems. Through a recursive technique for solving this Markov decision
process, we demonstrate the effect of improved intelligence and differing
weights in the face of uncertain situations.

I. Introduction

1.1  Motivation 

Even as advances in weapons and intelligence gathering improve U.S. military
capabilities, civilian casualties and collateral damage continue to hurt the
U.S. mission in the Middle East. According to sources [98] [45], over 6,000
Afghan civilian deaths can be attributed directly to U.S. and NATO military
actions since the inception of the Afghanistan campaign in 2001.

Specific to the USAF, in 2006, 116 Afghan civilians were killed as a result of
13 separate OEF and ISAF bombing missions. In 2007, those numbers grew to 321
civilians in 22 bombings [38]. Aerial bombardment, which has long been the
centerpiece of the U.S. strategic plan in Afghanistan, has had a devastating
impact on Afghan civilians [55]. Some [90] argue that civilian casualties caused
by American troops and American bombs have made the case for the insurgency.

The issue of civilian casualties has become a focal point of strategic planning
for both NATO and the insurgency forces in Afghanistan. Civilian casualties
often are the result of insurgents hiding among civilians or using the civilians
as human shields, since they know American forces are hesitant to strike
buildings in which they believe civilians are located [63]; however, that does
not stop insurgents from using these incidents as rallying cries to coerce the
Afghan populace. “The Taliban and Al Qaeda grasp the value of presenting
themselves as defenders of the Afghan people. They distribute pamphlets in which
they revile American and NATO soldiers as infidel, terrorist forces of
occupation. When those same forces send planes to bomb mosques and religious
schools, killing Afghan children, the Taliban do not hesitate to seize on the
tragedy as proof of the validity of their propaganda - even if merciless Al
Qaeda interlopers prevented those children from escaping the bombs [96].”

In November 2008, General David McKiernan, commander of U.S. Forces Afghanistan,
“ordered a tightening of procedures for launching airstrikes” while stating that
“minimizing civilian casualties is crucial [103].” In June 2009, as his
relationship with Defense Secretary Gates became strained due to the continued
civilian casualties [97], he was asked to resign. His replacement, General
Stanley McChrystal, in one of his first interviews upon taking the role, stated,
“A willingness to operate in ways minimizing casualties or damage is critical.
The measure of success will not be enemy killed. It will be shielding the Afghan
population from violence [1].” U.S. commanders have even gone so far as
requiring troops to withdraw when possible rather than get into a protracted
firefight that results in civilian casualties [39].

The issue of civilian casualties is not new to the U.S. military during the
Global War on Terrorism (GWOT), but with increasing numbers of news outlets,
social media forums, and personal communication devices, any mistake can be
immediately consumed by people around the world, even before the facts of the
scenario are fully known. Studies on collateral damage estimation from nuclear
weapons were performed after World War II [88]. Kiernan and Owen [55] discuss
the similarities between GWOT and Cambodian civilian casualties during the
Vietnam War. Keaney [54] speaks of the intelligence issues concerning collateral
damage from the Gulf War. Infamously, during the NATO campaign in Kosovo in
1999, the U.S. military mistakenly bombed the Chinese embassy in Belgrade,
believing the building to be a headquarters for the Yugoslav Army [75]. Similar
situations in Yugoslavia [7] [36] resulted in the destruction of buildings with
little to no military value.

Even with the most modern military technology and decades of war-time
experience, civilian casualties continue to plague U.S. forces and undermine the
mission they seek to accomplish in the Global War on Terrorism.

1.2 Collateral Damage

The three papers presented in this dissertation explore the three categories of
collateral damage and techniques or tools within each category to lower the
amount of collateral risk while still achieving mission success. Chapter 2
develops a tool for minimizing collateral risk from supply airdrops based on
airdrop dynamics. Chapter 3 provides a framework for understanding the trade-off
between lethality on a military target and the risk to collateral objects for
pre-planned airstrike missions. Chapter 4 develops guidelines for lowering
civilian casualties in fast-moving troops-in-contact scenarios where limited
intelligence often yields poor decision-making.

1.2.1 Airdrop Collateral Damage. As typically conceived and reported, collateral
damage applies to weapons fired near civilians and civilian buildings. However,
collateral damage also results from supply airdrops near populated areas. These
airdrops often need to occur near populated areas due to safety and logistic
concerns. The bundles, often weighing thousands of pounds and traveling at
speeds up to 20 miles per hour, become dangerous projectiles when falling near
civilian populations. Buildings can be damaged and, in extreme cases, people
have been killed [47].

Improvements in technology for airdrop platforms have vastly improved the
potential accuracy of airdrop missions [5] [69]. However, the majority of
airdrops are still performed by “dumb” techniques, where the bundles are not
guided to the ground, but rather fall freely once exiting the back of the
aircraft. These airdrop missions can often yield unrecoverable bundles when they
fall in places where recovery is either impossible, such as in a lake or on a
mountainside, or where recovery is dangerous, such as when bundles land miles
from an operating base in hostile territory.

The official Airdrop Collateral Damage Estimation Methodology [100] notes that
the art and science of airdrops, when combined with sound judgment and
operational considerations, yield a successful drop. The guidance notes that
probabilities, empirical data, and historical observations all should be
considered in the planning stages of an airdrop [99]. The guidance instructs
planners to ask themselves five questions regarding collateral damage during
development and execution of an airdrop plan:

•  Are  there  collateral  objects  within  the  collateral  hazard  area  of  the  intended 
airdrop  target? 

•  Can  the  functionality  of  the  collateral  objects  be  characterized? 

•  Can  collateral  concerns  be  mitigated  by  utilizing  different  parachutes/delivery 
methods  while  still  achieving  the  desired  effect? 

•  Are  there  civilians  at  risk  by  the  airdrop? 

•  Is  the  collateral  risk  of  the  airdrop  excessive  in  comparison  to  the  expected 
advantage  gained  by  the  airdrop? 

The collateral damage methodology presented in [100] develops an understanding
of airdrop dispersion, incidental consequences (collateral risk), and mitigation
techniques, but it is quick to point out that the methodology is not an exact
science. The Collateral Damage Weighted Risk Assessment Tool (CDWRAT) presented
in [100] uses a simple formula based on the size and location of collateral
objects and the circular error probable (CE90). The equation shown below yields
an overall percentage of collateral risk within the CE90.

$$A_w = \left( \frac{R_{long} - R_{co}}{R_{long}} + 1 \right) \times A_{co} \qquad (1)$$

where

$R_{long}$ - CE90 semi-major radius

$R_{co}$ - collateral object radius

$A_w$ - weighted collateral object area

$A_{co}$ - original collateral object area


Figure  1:  Current  Collateral  Damage  Estimated  Weighted  Risk  [99] 

This  calculation  is  performed  for  each  collateral  object  and  the  results  are 
aggregated: 


$$\frac{\text{total collateral risk}}{\text{total CE90 area}} \qquad (2)$$


While this equation presents a good start in the estimation of collateral risk
from airdrops, it neglects a number of factors which affect proper estimation of
collateral risk. Cammarano [23] improves this equation by finding the true
distribution of airdrops, rather than simply the CE90. Based on operational
drops, he argues that airdrops can be estimated using the bivariate normal
distribution. Further, his estimation tool uses the bivariate normal
distribution with the standard deviations in the x- and y-directions along with
zero correlation between the x- and y-miss distances (i.e. $\rho = 0$).

Further, Cammarano provides for weights being given to the collateral objects to
more accurately approximate real-world considerations. In the CDWRAT, the
example provided treats bodies of water and buildings of all types and purposes
as the same, with only the size and the distance to the center of the object
taken into consideration. Cammarano’s tool gives the planner the ability to say
that while landing a bundle in a lake and in an occupied building are both
undesirable, at least in the case of the lake no one is injured; thus a higher
weighting can be placed on the building. While Cammarano’s work doesn’t
(typically) yield a percentage of collateral risk, it gives a much more
meaningful statistic (overall collateral damage expected) to the
decision-maker.
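
As a rough illustration of why the zero-correlation assumption is convenient,
the sketch below computes a weighted expected-collateral score for rectangular
collateral objects under an uncorrelated bivariate normal miss distribution. It
is not Cammarano's tool: the rectangular footprints, the weights, and the
function names are hypothetical choices made only for this example.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def hit_probability(x_min, x_max, y_min, y_max, sigma_x, sigma_y,
                    aim_x=0.0, aim_y=0.0):
    """Probability that a bundle lands inside an axis-aligned rectangle.

    With zero correlation (rho = 0) the bivariate normal density factors
    into two independent one-dimensional normals, so the rectangle
    probability is the product of the x and y interval probabilities.
    """
    p_x = (normal_cdf((x_max - aim_x) / sigma_x)
           - normal_cdf((x_min - aim_x) / sigma_x))
    p_y = (normal_cdf((y_max - aim_y) / sigma_y)
           - normal_cdf((y_min - aim_y) / sigma_y))
    return p_x * p_y


def expected_collateral(objects, sigma_x, sigma_y, aim_x=0.0, aim_y=0.0):
    """Weighted sum of hit probabilities.

    Each object is (x_min, x_max, y_min, y_max, weight); the weight is a
    hypothetical planner-assigned severity, e.g. higher for an occupied
    building than for a lake.
    """
    return sum(
        weight * hit_probability(x0, x1, y0, y1, sigma_x, sigma_y,
                                 aim_x, aim_y)
        for (x0, x1, y0, y1, weight) in objects
    )
```

Moving `aim_x` and `aim_y` changes the score, which is the quantity an
aimpoint optimization would then try to minimize.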

Cammarano also allows for buildings of differing shapes, where the CDWRAT only
requires the center of the collateral object; this grants even more accuracy to
his methodology. Cammarano also incorporates multiple bundle drops from the same
mission, where each bundle has its own desired point of impact. Finally,
Cammarano’s tool gives a report for each collateral object, providing accurate
information to the decision-maker who might not want to assign weights to each
of the collateral objects.

The second chapter of this dissertation builds on Cammarano’s work. Once we are
able to understand the nature of airdrops and estimate the collateral risk based
upon relatively few inputs, the next question becomes: How do we then minimize
the expected amount of collateral risk for an airdrop?

1.2.2 Pre-Planned Airstrike Collateral Damage. Pre-planned airstrikes are
required to have a collateral damage estimation (CDE) done prior to engagement.
The overall purpose of this CDE is to lower the amount of collateral damage
resulting from the strike and to make the decision-maker fully aware of the
collateral risks prior to making the decision to strike. The U.S. military has
developed software which visually describes collateral risk in the area of the
blast [43].

Much of the literature pertains to the blast effect of weapons on buildings,
structures, and people. Ngo et al. [77] and Mays and Smith [67] discuss blast
effects on buildings, while Mills [71], Newmark and Hansen [76], and a U.S. Army
report [101] concern themselves with structural design to resist collateral
damage. Humphrey et al. [49] discuss the effects of weapons on people within the
blast and fragmentation radius.

Damage functions to model effects within the blast radius have been developed to
accurately represent collateral risk in an airstrike. Driels [34], in his work
on weaponeering, provides damage functions and estimates of lethality on
military targets. Douglass [33] presents a method for estimating collateral
damage in urban environments based on the proximity of the friendly forces to
the enemy combatants, the size of weapons, likelihood of false alarm, blast
radius, and circular error probable of the weapon used. David [29] gave
estimates of the safe distances in combat scenarios for friendly forces based on
circular error radius, the lethal ranges of weapons, and the damage functions of
these weapons. Lucas [64] extended David’s work by looking at the limiting
behavior of damage functions relative to one another, focusing primarily on four
damage functions: the lognormal, exponential, Gaussian, and cookie-cutter.
Przemieniecki [82] discusses aspects of damage functions, accuracy functions,
and collateral risk, as well as their effect on optimal aiming locations.
Binninger [14] presents a lognormal damage function for use in predicting the
effect of a nuclear weapon; by offsetting the aimpoint of the weapon, the
function can be used to estimate the percentage of buildings destroyed at a
given distance from the center of a town.
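
To make the four damage-function families named above concrete, the following
sketch gives one common textbook parameterization of each as a function of miss
distance r. These forms and parameter names (`lethal_radius`, `b`, `alpha`,
`beta`) are assumptions for illustration and are not necessarily the forms used
by the cited authors.

```python
import math


def cookie_cutter(r: float, lethal_radius: float) -> float:
    """Certain damage inside the lethal radius, none outside."""
    return 1.0 if r <= lethal_radius else 0.0


def gaussian(r: float, b: float) -> float:
    """Damage probability decaying as exp(-r^2 / (2 b^2))."""
    return math.exp(-(r * r) / (2.0 * b * b))


def exponential(r: float, b: float) -> float:
    """Damage probability decaying as exp(-r / b)."""
    return math.exp(-r / b)


def lognormal(r: float, alpha: float, beta: float) -> float:
    """One minus the lognormal CDF of r.

    alpha is the distance at which the damage probability is 0.5 and
    beta controls how sharply the probability falls off around alpha.
    """
    if r <= 0.0:
        return 1.0
    z = math.log(r / alpha) / beta
    return 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

All four return 1 at zero miss distance and decrease (weakly, for the
cookie-cutter) as r grows; they differ in how quickly damage probability falls
off with distance, which is what shifts the optimal aimpoint.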

There has been minimal work in viewing the airstrike problem as a
multi-criteria decision making model, most notably [57] and Brooks et al. [20],
who used agent-based simulation to explore the trade-off between building damage
and mission effectiveness. However, little has been done to develop the Pareto
frontier within this framework to give decision-makers the optimal aimpoints and
employment methods to minimize risk while maximizing lethality on a military
target.

1.2.3 Troops-in-Contact Collateral Damage. The majority of civilians killed in
U.S. airstrikes died when Special Forces summoned an airstrike to support them
during troops-in-contact (TIC) situations [39]. A TIC situation is an unplanned
opportunity strike in support of ground forces that have made contact with enemy
forces. In fact, only two (of 35) airstrikes resulting in civilian casualties in
2007-08 were from non-TIC (pre-planned) missions [38]. These rapid-response,
fluid strikes are characterized by (typically) a lack of prior information
concerning the nature and location of enemy and non-combatant forces, as well as
friendly forces which may be in serious danger.

The  ground  forces  in  TIC  situations,  with  the  use  of  an  Air  Force  JTAC, 
will  request  air  support  in  order  to  strike  the  enemy,  or  at  least  provide  them  the 
opportunity  to  extricate  themselves  safely  from  the  scene.  The  fog  of  war  in  the  form 
of  limited  intelligence  is  the  reason  that  TICs  produce  so  many  casualties.  When 
friendly  force  lives  are  at  stake,  the  air  forces  must  act  quickly  and  decisively,  and 
the  consequence  of  these  actions  may  be  the  loss  of  Afghan  civilian  lives. 

TIC scenarios have been the least studied of the three collateral hazards
presented in this dissertation, yet they present the most danger to civilians.
The closest information in the literature comes from other types of situations
where quick decisions need to be made with limited information. Kocher et
al. [56] look at the effects of time pressure on risky decisions and how pure
loss and pure gain decision models affect human decision making.
Decision-making where delaying the decision has an associated cost was studied
by Payne et al. [79], who found that in some cases delaying decision making
results in a lower expected return even when the best decision is ultimately
made.


Polikar [81] looks at decision-making where intelligence is gathered from
multiple participants with differing perspectives, ultimately arguing that these
varying perspectives yield better decisions. More germane to the battlefield,
Phillips et al. [80] seek to model the flow of information in combat situations,
a critical component of TICs. They present an information processing model which
allows decision-makers to understand the best intelligence available.

The work presented here seeks to provide a framework for the issues and
challenges which TICs present. The framework is a rough sketch of how
information (the key issue in TICs) flows in fast-moving scenarios. Within this
framework, we will seek to identify optimal decisions and optimal times to make
these decisions during these rapid-response situations.

1.3  Methodology  Literature  Review 

The three different problems presented within this dissertation yield varying
formulations which touch on a variety of operations research fields, including
non-linear programming, global search techniques, evolutionary algorithms,
multi-objective optimization, stochastic programming, and Markov decision
processes. The developed formulations and solution techniques are explored in
the following sections to give a background on which the three papers will be
based.

1.3.1 Non-Linear Programming. Bazaraa et al. [11] formulate the non-linear
minimization program as:

$$\min f(\mathbf{x}) \qquad (3)$$

subject to $g_i(\mathbf{x}) \le 0$ for $i = 1, \ldots, m$

$h_i(\mathbf{x}) = 0$ for $i = 1, \ldots, l$

$\mathbf{x} \in X$,

where $X$ is a subset of $\mathbb{R}^n$. $f$ is a function from
$\mathbb{R}^n \to \mathbb{R}$ and is referred to as the objective function.
$g_i(\mathbf{x}) \le 0$ and $h_i(\mathbf{x}) = 0$ are the constraints. In a
non-linear program, the constraints and objective functions can be non-linear
(whereas, in a linear program, all constraint and objective functions are
linear).

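If $f(\bar{\mathbf{x}}) \le f(\mathbf{x})$ for all $\mathbf{x} \in \mathbb{R}^n$,
then $\bar{\mathbf{x}}$ is the global minimum for the function $f$ in the
unconstrained problem. If there exists a neighborhood
$N_\epsilon(\bar{\mathbf{x}})$ around $\bar{\mathbf{x}}$ where
$f(\bar{\mathbf{x}}) \le f(\mathbf{x})$ for any
$\mathbf{x} \in N_\epsilon(\bar{\mathbf{x}})$, then $\bar{\mathbf{x}}$ is a local
minimum for $f$ in $\mathbb{R}^n$.

If $f$ is differentiable at $\bar{\mathbf{x}}$, and if $\bar{\mathbf{x}}$ is a
local minimum, then $\nabla f(\bar{\mathbf{x}}) = \mathbf{0}$. Conversely, if $f$
is differentiable at $\bar{\mathbf{x}}$ and there exists a vector $\mathbf{d}$
such that $\nabla f(\bar{\mathbf{x}})^T \mathbf{d} < 0$, then there exists a
$\delta > 0$ such that
$f(\bar{\mathbf{x}} + \lambda \mathbf{d}) < f(\bar{\mathbf{x}})$ for each
$\lambda \in (0, \delta)$ ($\mathbf{d}$ is a descent direction of $f$ at
$\bar{\mathbf{x}}$). The descent direction $\mathbf{d}$ represents a direction of
improvement in an optimization problem that is used in most non-linear
programming optimization algorithms, such as response surface methodology [74].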
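
The descent-direction condition can be checked numerically. The sketch below is
a minimal illustration, assuming a central-difference gradient estimate; the
function names and the step size `h` are hypothetical.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad


def is_descent_direction(f, x, d):
    """True when the directional derivative grad f(x) . d is negative."""
    grad = numerical_gradient(f, x)
    return sum(g * di for g, di in zip(grad, d)) < 0.0


def move(x, d, lam):
    """Return the point x + lam * d."""
    return [xi + lam * di for xi, di in zip(x, d)]


def example_objective(x):
    """A simple smooth bowl with its minimum at (1, -0.5)."""
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2
```

For instance, at the point (3, 2) the negative gradient of `example_objective`
passes the test, and a small step along it lowers the objective value, as the
condition above predicts.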

Similarly, in the constrained problem, if
$f(\bar{\mathbf{x}}) \le f(\mathbf{x})$ for all $\mathbf{x} \in S$, where $S$ is
the feasible region for the problem, then $\bar{\mathbf{x}}$ is the global
minimum, and if there exists a neighborhood $N_\epsilon(\bar{\mathbf{x}})$ around
$\bar{\mathbf{x}}$ where $f(\bar{\mathbf{x}}) \le f(\mathbf{x})$ for any
$\mathbf{x} \in S \cap N_\epsilon(\bar{\mathbf{x}})$, then $\bar{\mathbf{x}}$ is
a local minimum for $f$ in $S$. With $S$ as a non-empty set in $\mathbb{R}^n$ and
$\bar{\mathbf{x}} \in S$, the cone of feasible directions ($D$) of $S$ at
$\bar{\mathbf{x}}$ is

$$D = \{\mathbf{d} : \mathbf{d} \ne \mathbf{0}, \text{ and } \bar{\mathbf{x}} + \lambda \mathbf{d} \in S \text{ for all } \lambda \in (0, \delta) \text{ for some } \delta > 0\}.$$

The cone of improving directions $F$ at $\bar{\mathbf{x}}$ of $f$ is

$$F = \{\mathbf{d} : f(\bar{\mathbf{x}} + \lambda \mathbf{d}) < f(\bar{\mathbf{x}}) \text{ for all } \lambda \in (0, \delta) \text{ for some } \delta > 0\}.$$

If $f$ is differentiable at $\bar{\mathbf{x}} \in S$, then $\bar{\mathbf{x}}$ is
a local optimum only if $F \cap D = \emptyset$. That is, there exists no
feasible, improving direction for $f$ at $\bar{\mathbf{x}}$. The concepts of
feasible and improving directions are integral to area search methods over
continuous (particularly, differentiable) objective functions.

While Lasdon [61] presents algorithms and heuristics for global optimization of
large systems, Roy et al. [86] review commercially available packages for
spreadsheet optimization. Archetti and Schoen [6], in their survey of global
optimization techniques, group approaches into space-covering techniques,
trajectory techniques (such as response surface methodology), and random search
techniques, finding that each technique has its merit and no strategy is optimal
without a priori information on the objective function.

1.3.1.1 Random Search Techniques. A random search technique is proposed by
Solis and Wets [93] to find global minima in optimization problems, expanding on
the work of Anderson [4], Rastrigin [84], and Karnopp [53]. Their work is
particularly useful in situations where function characteristics are difficult
to compute, when the response function is “bumpy”, when processing time is
limited, and when it is highly desirable to find a global minimum among a large
number of local minima. The assumption for the response function is that it is
continuous, since a discontinuous function could conceivably have a minimum at a
point of discontinuity, which would be (nearly) impossible to find without an
exhaustive search of every point in the input space $S$. Thus, they search for
the essential infimum $\alpha$ of $f$ on $S$, which is defined as
$\alpha = \inf\{t : \nu[\mathbf{x} \in S \mid f(\mathbf{x}) < t] > 0\}$. That
is, the set of points that yield values close to the essential infimum $\alpha$
has non-zero $\nu$-measure, meaning that the search is for a location where a
set of points have a response less than $t$ and this set also must have an
interior (consider the case where the global minimum is at a point of
discontinuity $\mathbf{x}$: for some $t > f(\mathbf{x})$ and $r > 0$ there exists
no neighborhood $B(\mathbf{x}; r)$ such that all points in $B(\mathbf{x}; r)$
have a response value less than $t$).

Importantly, Solis and Wets note that any global search method must meet the
assumption that for any Borel subset $A$ of $S$ with $\nu(A) > 0$, we have that
$\prod_{k=0}^{\infty} [1 - \mu_k(A)] = 0$, where $\mu_k(A)$ is the probability
that the method samples from $A$ at iteration $k$. In essence, this means that
any subset $A$ (with volume) of the search space $S$ must be searched to
guarantee that the global minimum is found.

The Solis and Wets algorithm uses normally-distributed steps to generate new
points. The response value of each new point is calculated, and if the newly
generated point has a higher (worse) objective function value, then a step is
taken from the initial point in the opposite direction to find a new point. If
both of these new points are worse than the original point, then a new starting
point is generated. Hart [44] notes that the Solis and Wets algorithm lacks
definitive stopping criteria that yield optimality, typically relying on a fixed
number of iterations. Additionally, Hart states, “In general, methods that
utilize a priori information about a problem will outperform general purpose
methods that utilize less information [44].”
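
The forward-then-reverse step logic described above can be sketched as follows.
This is a simplified illustration rather than the full Solis and Wets method:
it uses a fixed step standard deviation `sigma`, a fixed iteration budget as the
stopping rule, and, when both trial points are worse, it simply keeps the
current point and draws a fresh step. All of those choices are assumptions made
for this example.

```python
import random


def random_search_sketch(f, x0, sigma=0.5, iterations=2000, seed=0):
    """Minimize f by normally distributed trial steps from the current point."""
    rng = random.Random(seed)
    x = list(x0)
    fx = f(x)
    for _ in range(iterations):  # fixed budget; no optimality certificate
        step = [rng.gauss(0.0, sigma) for _ in x]

        # Try the forward step first.
        forward = [xi + si for xi, si in zip(x, step)]
        f_forward = f(forward)
        if f_forward < fx:
            x, fx = forward, f_forward
            continue

        # Forward step was worse: try the same step in the opposite direction.
        backward = [xi - si for xi, si in zip(x, step)]
        f_backward = f(backward)
        if f_backward < fx:
            x, fx = backward, f_backward
        # If both are worse, keep x and draw a new step next iteration.
    return x, fx


def sphere(x):
    """Smooth test function with its global minimum of 0 at the origin."""
    return sum(v * v for v in x)
```

Because a trial point is accepted only when it improves the objective, the
returned value never exceeds the starting value; nothing in the sketch
certifies that the result is a global minimum, which is the limitation Hart
points out.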

Niederreiter [78] presents quasi-Monte Carlo methods for generating a sequence
of uniformly distributed random points spread on a space. Estimates, using the
variance of these random points, can be made for the value of the minima over
the searched area, and local search methods can be used in conjunction with
these quasi-Monte Carlo techniques; however, global minimization again cannot be
guaranteed on an objective function and domain without a priori information.

1.3.1.2 Response Surface Methodology. Anderson [4] discusses experimental
design and response surface methods to find input parameters for optimal
performance in an uncharacterized experiment. Brooks [21] compares steepest
ascent and univariate iterative methods for determining the optimal settings
during experimentation, finding the ascent methods superior in terms of accuracy
and speed.

A basic approach for approximating response functions is proposed by Myers and
Montgomery [74] with $y = f(\xi_1, \xi_2, \ldots, \xi_k) + \epsilon$, where $f$
is the true response function, which is either unknown or complicated. In this
work, $\epsilon$ will represent sources of variation that are not accounted for
by the derived model. $\xi_1, \xi_2, \ldots, \xi_k$ will be the input values for
our model; in the airdrop model these will typically be the aimpoint ($x$ and
$y$) and the approach angle.

Myers and Montgomery discuss further the sequential nature of response surface
methodology, whereby hypotheses regarding the important input variables are
formed initially, often backed up with a screening experiment. The screening
experiment will identify the variables affecting the response variable and which
variables’ effects should be investigated further. After the screening takes
place, they recommend the use of a first-order model and the method of steepest
ascent, whereby starting from an initially small portion of the overall search
space (referred to by Myers and Montgomery as the region of interest), we begin
to move in the direction of the optimal combination of input variables. This
method of steepest ascent is performed iteratively until a maximum for the
response function is found, that is, until the current solution can no longer be
improved within the region of interest. Critically, it should be noted that the
maximum found by this technique is simply a local maximum for the response
function and is not guaranteed to be a global maximum. Situations arise where
the response surface will be “bumpy” and have many local maxima throughout the
total search space; thus while the techniques of Myers and Montgomery will be
used, they must be extended in the effort to find a global maximum.
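
The first-order-model-plus-steepest-ascent loop can be sketched as below. This
is an illustrative simplification, assuming a noise-free response, a two-level
full factorial design in coded units, and a fixed step length; `half_width`,
`step`, and the stopping rule are hypothetical choices, not settings taken from
Myers and Montgomery.

```python
import itertools


def fit_first_order(f, center, half_width):
    """Estimate first-order slopes from a 2^k factorial design around center.

    For an orthogonal two-level design, the least-squares slope of factor i
    (in coded units) is the average of (coded level * response).
    """
    k = len(center)
    design = list(itertools.product((-1.0, 1.0), repeat=k))
    responses = [
        f([c + half_width * level for c, level in zip(center, run)])
        for run in design
    ]
    return [
        sum(run[i] * y for run, y in zip(design, responses)) / len(design)
        for i in range(k)
    ]


def steepest_ascent(f, start, half_width=0.1, step=0.1, max_moves=500,
                    tol=1e-9):
    """Move along the fitted slope direction until the response stops rising."""
    center = list(start)
    best = f(center)
    for _ in range(max_moves):
        slopes = fit_first_order(f, center, half_width)
        norm = sum(b * b for b in slopes) ** 0.5
        if norm < tol:
            break  # fitted surface is flat in the region of interest
        candidate = [c + step * b / norm for c, b in zip(center, slopes)]
        value = f(candidate)
        if value <= best:
            break  # no further local improvement
        center, best = candidate, value
    return center, best


def example_response(x):
    """Single-peaked response with its maximum of 0 at (1, -2)."""
    return -((x[0] - 1.0) ** 2) - (x[1] + 2.0) ** 2
```

On a single-peaked surface such as `example_response` the loop stops within one
step length of the peak. On a bumpy surface it stops at whichever local maximum
is uphill from the start, so a global search would need to wrap it, for example
by restarting from several regions of interest.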

1.3.2 Evolutionary Algorithms. Hart [44] inspected genetic algorithms in
combination with local search algorithms for solving global optimization
problems. Michalewicz and Schoenauer [70] discuss adapting evolutionary
algorithms to constrained parameter optimization problems, pointing out that
finding a general algorithm that is optimal for all non-linear programs is
unrealistic. The existence of local (and not global) optima presents the primary
problem in non-linear programs with continuous functions, since steepest descent
algorithms will yield only local (and not necessarily global) optima.
Michalewicz and Schoenauer break down evolutionary algorithms into mutation
operators, such as [10], [78] and [37], and crossover operators, as in [87],
[35], [30] and later [94]. Mutation operators typically use Gaussian mutation to
modify components of a solution vector, whereas crossover operators use multiple
parent solution vectors to develop future generations of solution vectors. In
both cases, the algorithms remove less-optimal solutions in each succeeding
generation. Further, [70] views the constraint-handling methods as falling into
four categories: feasibility preserving, penalty-based,
feasibility/infeasibility separated, and hybrid methods.

Storn and Price [94] developed the differential evolution technique for solving
global optimization problems. Their algorithm, discussed in much more detail in
the following chapter, inspired a great deal of effort in the realm of
optimization. Lampinen and Zelinka [59] apply the differential evolution
algorithm to mixed integer-discrete-continuous problems, demonstrating the
versatility of the algorithm. Similar to what [70] presented for the general
non-linear programming algorithm, Lampinen [58] presents a constraint-handling
approach for the constrained differential evolution algorithm. Huang et al. [48]
demonstrate a self-adaptive algorithm for constrained non-linear problems which
modifies the two control parameters (F and CR) of differential evolution, thus
cutting out the need for exhaustive parameter fine-tuning.
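
To show where the two control parameters F and CR enter, the following is a
minimal sketch of the commonly described rand/1/bin variant of differential
evolution. It is not the implementation used in this dissertation; the
population size, parameter values, and the clipping of trial vectors to the
bounds are assumptions made for illustration.

```python
import random


def differential_evolution_sketch(f, bounds, pop_size=20, F=0.8, CR=0.9,
                                  generations=200, seed=0):
    """Minimize f over box bounds with a rand/1/bin differential evolution."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [f(ind) for ind in pop]

    for _ in range(generations):
        next_pop = list(pop)
        next_fitness = list(fitness)
        for i in range(pop_size):
            # Three distinct population members, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # guarantees one mutated component
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == forced:
                    # F scales the difference vector added to the base member.
                    value = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    value = min(max(value, lo), hi)  # assumed bound handling
                else:
                    value = pop[i][j]
                trial.append(value)
            f_trial = f(trial)
            # Greedy selection: the trial replaces its parent only if no worse.
            if f_trial <= fitness[i]:
                next_pop[i], next_fitness[i] = trial, f_trial
        pop, fitness = next_pop, next_fitness

    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best]


def sphere(x):
    """Smooth test function with its global minimum of 0 at the origin."""
    return sum(v * v for v in x)
```

Larger F produces bigger exploratory jumps, while CR sets how many components
of each trial vector come from the mutant rather than the parent; these are the
settings a self-adaptive scheme such as that of Huang et al. would adjust
automatically.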

1.3.3 Multi-Objective Optimization. Multi-objective optimization is applied
in cases where there is more than one objective function:

$$\min\; y = f(x) = (f_1(x), f_2(x), \ldots, f_n(x)) \qquad (4)$$

subject to $x = (x_1, x_2, \ldots, x_m) \in X$,
$y = (y_1, y_2, \ldots, y_n) \in Y$,

where x is the solution vector, X is the solution space, y is the objective vector,
and Y is the objective space [107]. With more than one objective function, there
is no strict ordering in the objective space (unlike in a single-objective problem).
Therefore, the Pareto frontier concept is implemented, where we say that a
solution $x^*$ is non-dominated if there exists no other $x$ such that
$f_k(x) \le f_k(x^*)$ for all $k = 1, \ldots, n$ and $f_k(x) < f_k(x^*)$ for some
$k \in \{1, \ldots, n\}$. The collection of such non-dominated (Pareto-optimal)
points forms the Pareto frontier for the formulation.
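
The non-domination test can be sketched in a few lines for a finite set of candidate objective vectors. This is a brute-force illustration under the minimization convention of (4); the function names and the sample points are hypothetical.

```python
def dominates(fa, fb):
    """True if objective vector fa dominates fb (all objectives minimized)."""
    no_worse = all(a <= b for a, b in zip(fa, fb))
    strictly_better = any(a < b for a, b in zip(fa, fb))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the points that no other point dominates (O(n^2) comparison)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective values for five candidate solutions.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 5.0)]
front = pareto_front(candidates)
```

Here (3, 4) and (5, 5) are discarded because (2, 3) is at least as good in both objectives and strictly better in one.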

Techniques to solve multi-objective formulations attempt to find solutions near
the Pareto frontier. However, since the Pareto frontier is typically neither a single
point nor (for continuous solution spaces) a finite collection of points, finding
solutions near the true Pareto frontier is not enough. Multi-objective algorithms also
seek to find a variety of solutions that describe the Pareto frontier more completely.
Zitzler et al. [106] provide three metrics which describe the performance of
multi-objective algorithms; these metrics are based on accuracy of the solutions,
diversity of solutions, and breadth of solutions in creating the Pareto frontier.

Evolutionary algorithms have been found to be particularly robust in developing
the Pareto frontier for multi-objective problems due to their ability to process
a set of solutions in parallel, thereby exploiting similarities of solutions by
recombination [107]. Zitzler and Thiele [107] developed a strength Pareto evolutionary
algorithm approach (SPEA) which stores nondominated solutions externally in a
second group, evaluates the fitness of a solution according to the number of external
nondominated points which dominate it, and clusters the nondominated point set in
order to lower the nondominated set’s population without losing diversity.

Babu and Jehan [9] extended the work of Storn and Price [94] to the
multi-objective realm by iteratively expelling dominated solutions in each generation.
Xue et al. [104] present another differential evolution based algorithm (MODE)
where non-dominated solutions are identified at every generation, with the mutation
step being different for non-dominated and dominated points. A Pareto-frontier
differential evolution (PDE) is presented by Abbass et al. [2] which is seen to improve
upon the SPEA approach of [107], with their approach constantly finding new
non-dominated points in each generation and then removing similar ones based on
a distance metric.

1.3.4 Dynamic Programming. Dynamic programming can be thought of
in terms of stages and states. A stage is a discrete point in time at which a decision
($u_k$) is made based on the state ($x_k$) of the system. The state is the summary of
all decisions from previous stages and their outcomes, due not only to the decisions
made but also the randomness ($w_k$) involved with moving from stage to stage. Some
additive reward is gained for each decision made, and the goal is to maximize the
sum of the rewards over the time horizon of the problem.

Bertsekas [13] lays out the main ingredients of a basic dynamic programming
formulation as:

1. A discrete-time system of the form $x_{k+1} = f_k(x_k, u_k, w_k)$

2. Independent random parameters

3. A control constraint (decision)

4. An additive cost of the form $E\{g_N(x_N) + \sum_{k=0}^{N-1} g_k(x_k, u_k, w_k)\}$

5. Optimization over policies (rules for choosing $u_k$ for each k and each possible
value of $x_k$).

Denardo [31] formulates the dynamic programming problem as:

$$x_{k+1} = f_k(x_k, u_k, w_k), \quad k = 0, 1, \ldots, N - 1 \qquad (5)$$

where:

k indexes discrete time,

$x_k$ is the state of the system and summarizes past information that is relevant for
future optimization,

$u_k$ is the control or decision variable to be selected at time k,

$w_k$ is a random parameter,

N is the number of times control is applied.
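
The recursion implied by (5) and the additive cost can be sketched as a backward pass over the stages. The sketch below is a generic finite-horizon solver written for illustration in the cost-minimization form; the toy stocking problem, its costs, and all function names are assumptions and are not taken from the dissertation.

```python
def backward_induction(states, controls, noise, f, g, g_terminal, N):
    """Finite-horizon dynamic programming by backward recursion.

    noise is a list of (w, probability) pairs; f and g take (k, x, u, w).
    Returns the cost-to-go tables J[k][x] and the policy mu[k][x].
    """
    J = [dict() for _ in range(N + 1)]
    mu = [dict() for _ in range(N)]
    for x in states:
        J[N][x] = g_terminal(x)
    for k in range(N - 1, -1, -1):
        for x in states:
            best_u, best_cost = None, float("inf")
            for u in controls(x):
                # Expected stage cost plus cost-to-go of the next state.
                cost = sum(p * (g(k, x, u, w) + J[k + 1][f(k, x, u, w)])
                           for w, p in noise)
                if cost < best_cost:
                    best_u, best_cost = u, cost
            J[k][x], mu[k][x] = best_cost, best_u
    return J, mu


# Hypothetical toy problem: stock level x in {0, 1, 2}, order u units,
# random demand w of 0 or 1, unit order cost 1, shortage penalty 3.
states = [0, 1, 2]
noise = [(0, 0.5), (1, 0.5)]


def controls(x):
    return range(0, 2 - x + 1)


def f(k, x, u, w):
    return max(0, x + u - w)


def g(k, x, u, w):
    return 1.0 * u + 3.0 * max(0, w - x - u)


J, mu = backward_induction(states, controls, noise, f, g, lambda x: 0.0, N=2)
```

With these numbers, starting with no stock the best first decision is to order one unit, for an expected total cost of 1.5.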

For the LLS problem, we assume that each stage yields more data about
the problem, provided the pilot continues to loiter. The rewards in this model are
typically assumed to be costs, either the cost of a “look” decision or the likelihood of
being incorrect given a “leave” or “shoot” decision. Ahner [3] applied approximate
dynamic programming techniques to optimize control of unmanned aerial vehicles in
combat situations.

1.3.5 Stochastic Programming. Avriel and Williams [8] derive the expected
value of information in recourse problems and show the value of a wait-and-see
approach versus a recourse method. The difference between them is that in a recourse
problem, a decision is made, then a random variable is observed, and then a recourse
to a contingency plan is determined. A wait-and-see approach supposes that one
could see what the random variable is before making an initial decision, and could
therefore optimize that decision based on the known data rather than the unknown
random variable. Certainly, the expected profit from the wait-and-see approach (WS)
must be at least as great as in the recourse problem case (RP). They then ask how
much one should be willing to pay for perfect information if it could be purchased.
The answer is the expected value of perfect information, EVPI = WS - RP.
Avriel and Williams prove that EVPI ≥ 0 given that an expected
value of the random variable and the indicated maxima exist. Further, they show
that 0 ≤ EVPI ≤ EV - RP. EVPI can be applied to stochastic linear problems
with recourse and more general stochastic programs including those with quadratic
recourse.
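
A small numerical illustration of these quantities follows. It uses a hypothetical single-period ordering problem (prices, demands, and probabilities are invented for the example) in which the only "recourse" is that sales are capped by realized demand. EV is taken here to be the optimal profit when demand is replaced by its mean, which is an assumption about the definition rather than something stated above.

```python
def profit(order, demand, price=5.0, unit_cost=3.0):
    """Profit of ordering `order` units when `demand` units can be sold."""
    return price * min(order, demand) - unit_cost * order


# Hypothetical demand scenarios as (demand, probability) pairs.
scenarios = [(10, 0.5), (30, 0.5)]
orders = range(0, 41)

# RP: one order quantity is fixed before demand is observed.
rp = max(sum(p * profit(q, d) for d, p in scenarios) for q in orders)

# WS: demand is observed first, then the best order is chosen per scenario.
ws = sum(p * max(profit(q, d) for q in orders) for d, p in scenarios)

# EVPI: the most one should pay to learn demand in advance.
evpi = ws - rp

# EV: optimal profit of the problem with demand fixed at its expected value.
mean_demand = sum(p * d for d, p in scenarios)
ev = max(profit(q, mean_demand) for q in orders)
```

In this example RP = 20, WS = 40, and EVPI = 20, which also equals EV - RP, so the upper bound is attained.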


1.3.5.1 Two-Stage Stochastic Linear Program. Kall and Wallace [51]
formulate the two-stage stochastic linear program as:

$$\min\; c^T x + Q(x) \qquad (6)$$

subject to $Ax = b$, $x \ge 0$,

where $Q(x) = \sum_j p_j Q(x, \xi^j)$

and $Q(x, \xi) = \min\{ q(\xi)^T y \mid W(\xi) y = h(\xi) - T(\xi) x,\; y \ge 0 \}$,

where $p_j$ is the probability that $\xi = \xi^j$, the jth realization of $\xi$,
$h(\xi) = h_0 + H\xi = h_0 + \sum_i h_i \xi_i$, $T(\xi) = T_0 + \sum_i T_i \xi_i$, and
$q(\xi) = q_0 + \sum_i q_i \xi_i$.

Higle and Sen [46] present an algorithm for two-stage linear programs with
recourse that builds on Benders’ decomposition, whereby they randomly generate
observations of random variables to construct statistical estimates of supports of
the objective function. Gassmann [40] presents a computer code for the multistage
stochastic linear programming problem that uses an implementation of a nested
decomposition algorithm.

Interior point methods are also considered by Birge and Holmes [16], Lustig et
al. [65], and Dantzig and Madansky [28]. Birge and Qi [17] are credited with applying
Karmarkar’s [52] interior-point method to stochastic programming. They formulate
the stochastic linear program with recourse (and discrete random variables) as:

$$\min\; c^T x_0 + Q(x_0) \qquad (7)$$

subject to $A_0 x_0 = b_0$,
$x_0 \ge 0$,

where $Q(x_0) = \sum_{i=1}^{N} p_i Q(x_0, \xi_i)$

and, for each $i = 1, \ldots, N$, the recourse cost $Q(x_0, \xi_i)$ is obtained by
solving the recourse problem:

$Q(x_0, \xi_i) = \inf\{ q_i^T y \mid W y = h_i - T_i x_0,\; y \in \mathbb{R}^{n_1}_{+} \}$,

$\xi_i = (q_i, h_i, T_i)$,

$p_i = \mathrm{prob}[\xi(\omega) = \xi_i]$.

Birge and Qi noted that the dual block angular linear programs have the form:

$$\min\; c^T x_0 + \sum_{i=1}^{N} c_i^T x_i \qquad (8)$$

subject to $A_0 x_0 = b_0$,
$A_i x_0 + W_i x_i = b_i$, $i = 1, \ldots, N$,
$x_i \ge 0$, $i = 0, \ldots, N$,

where $x_i \in \mathbb{R}^{n_i}$, $i = 0, \ldots, N$, and $b_i \in \mathbb{R}^{m_i}$,
$i = 0, \ldots, N$, where $m_i \le n_i$, $i = 0, \ldots, N$, and the constraint
matrices have full row rank.

By substituting the expressions for Q into the stochastic formulation, a linear
programming formulation is created with $W = W_i$, $T_i = A_i$, and $p_i q_i = c_i$ for
$i = 1, \ldots, N$. The resulting problem has $n = n_0 + N n_1$ variables and
$m = m_0 + N m_1$ constraints. As Birge and Qi point out, methods for solving the
linear formulation include Van Slyke and Wets’ L-shaped method [102], Dantzig and
Madansky’s decomposition method [28], and the basis factorization method proposed
by Strazicky [95]. The L-shaped method solves the primal problem, while the
decomposition and factorization methods solve the dual formulation.
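
The growth in problem size can be seen by assembling the constraint matrix of the block angular form directly. The helper below is a sketch written for this illustration (its name and the tiny matrices are hypothetical); it only stacks the blocks and does not solve the linear program.

```python
def assemble_block_angular(A0, A_blocks, W_blocks):
    """Stack the constraint rows of the block angular form.

    Row layout:  [A0   0    ...  0  ]
                 [A_1  W_1  ...  0  ]
                 [...               ]
                 [A_N  0    ...  W_N]
    Matrices are given as lists of rows.
    """
    widths = [len(W[0]) for W in W_blocks]
    rows = [list(r) + [0.0] * sum(widths) for r in A0]
    for i, (A, W) in enumerate(zip(A_blocks, W_blocks)):
        left_pad = [0.0] * sum(widths[:i])
        right_pad = [0.0] * sum(widths[i + 1:])
        for a_row, w_row in zip(A, W):
            rows.append(list(a_row) + left_pad + list(w_row) + right_pad)
    return rows


# Hypothetical sizes: m0 = 1, n0 = 2 and N = 3 scenarios with m1 = 1, n1 = 2.
A0 = [[1.0, 1.0]]
A_blocks = [[[1.0, 0.0]] for _ in range(3)]
W_blocks = [[[1.0, -1.0]] for _ in range(3)]

M = assemble_block_angular(A0, A_blocks, W_blocks)
n_rows, n_cols = len(M), len(M[0])
```

The assembled matrix has m0 + N·m1 = 4 rows and n0 + N·n1 = 8 columns, and most of its entries are zero, which is the sparsity that decomposition and interior point implementations exploit.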

Lustig et al. [65] base their work on scenario analysis, where a few realizations
of the stochastic parameters are representative of the space of possible parameter
outcomes. For a two-stage model, the size of the optimization problem grows linearly
with the number of scenarios and, because of that size, decomposition methods are
typically used. They point out that interior point methods make solving these large
resulting models feasible. They implemented a primal-dual interior point method
similar to the one described earlier and show that it performs significantly better
than Birge and Qi’s [17] dual block angular approach. Additionally, they propose a
partial splitting method which, due to the sparsity of the A matrices, speeds up the
interior point methods considerably.

Birge and Holmes [16] formulated a dual affine algorithm starting with a
dual feasible interior point, noting that the vast majority of the computational effort
required is to calculate a solution to the symmetric positive definite system
$(A D^2 A^T)\,dy = b$, or to calculate a factorization of the matrix that will enable
quick solution of the system. Further, Blomvall and Lindberg [18] present a
Riccati-based primal interior point solver for multistage stochastic programming.

Birge and Holmes [15] also present a paper on the motivation for use of interior
point methods for solving two-stage stochastic linear programs with fixed recourse
along with characteristics of interior point solving methods. Additionally, they note
that the size of stochastic linear programs can become extremely large due to the
number of permutations of the unknown variables, and thus they present methods
for speeding up the interior point methods, including reformulation of the program,
transpose product factorization, and factorization of the dual block angular programs.

1.3.6 Markov Decision Process. A Markov decision process (MDP) is a
problem in which there is a decision maker, a finite number of policies or choices
the decision maker can choose, a transition probability matrix which defines the
likelihood of the next state given the current state and policy, a transition reward
matrix which indicates the current reward gained for the state and policy, and a
performance metric based on the rewards gained during the stages of the MDP [42].
Its components are:

S, a finite state space of possible system states. A realization of the random variable
S is denoted by s.

A, a finite set of actions. A realization of the random variable A is denoted by a.
An action a causes transitions from the current state to some new state.

$T : S \times A \times S \to [0, 1]$ is the state-transition function, giving the
probability that the agent transitions to state s' when it is in state s and takes
action a. In other words, the transitions specify how each of the actions and
exogenous events change the state of the world. We denote this probability by
$T(s, a, s') = P(s' \mid a, s)$. For each state s and action a,
$\sum_{s'} P(s' \mid a, s) = 1$.

$R : S \times A \to \mathbb{R}$ is the reward function giving the expected immediate
reward gained by the agent for taking each action in each state.
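
The tuple (S, A, T, R) can be solved for an infinite horizon by value iteration. The sketch below is illustrative only: the discount factor gamma, the stopping tolerance, and the two-state example are assumptions introduced here and are not part of the definition above.

```python
def value_iteration(states, actions, T, R, gamma=0.9, tol=1e-9):
    """Infinite-horizon discounted value iteration.

    T[s][a][s2] is the transition probability and R[s][a] the expected
    immediate reward. Returns the value function and a greedy policy.
    """
    def backup(V, s, a):
        return R[s][a] + gamma * sum(T[s][a][s2] * V[s2] for s2 in states)

    V = {s: 0.0 for s in states}
    while True:
        V_new = {s: max(backup(V, s, a) for a in actions) for s in states}
        converged = max(abs(V_new[s] - V[s]) for s in states) < tol
        V = V_new
        if converged:
            break
    policy = {s: max(actions, key=lambda a: backup(V, s, a)) for s in states}
    return V, policy


# Hypothetical example: "stay" earns 1 per stage forever, "go" earns 5 once
# and moves to an absorbing state with no further reward.
states = ["start", "done"]
actions = ["stay", "go"]
T = {
    "start": {"stay": {"start": 1.0, "done": 0.0},
              "go": {"start": 0.0, "done": 1.0}},
    "done": {"stay": {"start": 0.0, "done": 1.0},
             "go": {"start": 0.0, "done": 1.0}},
}
R = {"start": {"stay": 1.0, "go": 5.0}, "done": {"stay": 0.0, "go": 0.0}}

V, policy = value_iteration(states, actions, T, R)
```

With gamma = 0.9, staying forever is worth 1/(1 - 0.9) = 10, which exceeds the one-time reward of 5, so the greedy policy in "start" is "stay".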

Broyles [22] applied Markov decision processes to patient throughput in
hospitals, while Qiu and Pedram [83] looked at the Markov decision
process for continuous-time decision-making.

1.3.7 Partially-Observable Markov Decision Process. When the agent is
unsure of the state s that he is currently in, unlike the MDP where the agent knows
where he is at all times, the problem becomes a partially-observable Markov decision
process (POMDP). The agent instead holds a probability distribution over the states
he may be in (the belief state b). McAllister [68] looked at optimal
planning with imperfect information, such as what U.S. troops have on battlefields.

$O : S \times A \to \Pi(\Omega)$ is the observation function, which gives, for each
action and resulting state, a probability distribution over the set $\Omega$ of all
possible observations (we write $O(s', a, o)$ for the probability of making
observation o given that the agent took action a and landed in state s').
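
Given T and O, the belief state is revised after each action and observation by Bayes' rule, with the new belief in s' proportional to $O(s', a, o) \sum_s T(s, a, s')\, b(s)$. The sketch below illustrates this update; the two target states, the "look" action, and the 80% observation accuracy are hypothetical values chosen for the example.

```python
def update_belief(belief, action, observation, T, O):
    """Bayesian belief update for a POMDP.

    belief[s] is the prior probability of state s, T[s][a][s2] the transition
    probability, and O[s2][a][o] the probability of observing o after taking
    action a and landing in s2.
    """
    states = list(belief)
    unnormalized = {}
    for s2 in states:
        predicted = sum(T[s][action][s2] * belief[s] for s in states)
        unnormalized[s2] = O[s2][action][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s2: v / total for s2, v in unnormalized.items()}


# Hypothetical setup: the target's nature does not change between looks,
# and each look reports the true nature with probability 0.8.
T = {
    "hostile": {"look": {"hostile": 1.0, "neutral": 0.0}},
    "neutral": {"look": {"hostile": 0.0, "neutral": 1.0}},
}
O = {
    "hostile": {"look": {"sees_hostile": 0.8, "sees_neutral": 0.2}},
    "neutral": {"look": {"sees_hostile": 0.2, "sees_neutral": 0.8}},
}

prior = {"hostile": 0.5, "neutral": 0.5}
after_one = update_belief(prior, "look", "sees_hostile", T, O)
after_two = update_belief(after_one, "look", "sees_hostile", T, O)
```

Starting from an even prior, one "hostile" report raises the belief to 0.8 and a second consistent report raises it to 16/17.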

Monahan [72] introduces a system where one of three decisions may be made.
The observer can “inspect” (attempt to observe the true state of the target
another time, at a cost), “stop” (make a determination as to the true state of the
target and have no option for further observation), or “continue” (move
to the next time interval, at a cost, where the same three options will again be
available to him). In the next time interval there is some probability that the nature
of the target has changed, which differs from the assumptions in this paper.
Monahan  concluded,  “While  the  Markovian  property  does  not  hold  for  the  state  of 
the  system,  it  does  hold  for  the  belief  state  of  the  system.  The  optimal  policy  for 
any  given  stage  is  only  dependent  on  the  current  belief  state  and  not  decisions  made 
in  previous  stages.” 

Monahan  [73]  later  looked  at  the  applications  and  theory  behind  partially 
observable  MDPs.  Kaelbling  et  al.  [50]  and  Smallwood  and  Sondik  [92]  looked  at 
optimal  decision  policies  in  partially  observable  MDPs.  Yost  and  Washburn  [105] 
applied  linear  programming  techniques  for  decision  making  within  POMDPs. 

1.4 Overview of Literature Review

The following figures provide an overview of the topics covered and methodologies
implemented in the following chapters.

Figure 2: Motivation and Background Literature Review Summary

Figure 3: Methodology Literature Review Summary

1.5  Description  of  Research 

This dissertation first seeks to understand the nature of both airdrops and
airstrikes in terms of the parameters and distributions that accurately represent
all aspects of these missions. Once the parameters are understood, a formulation
for each of these problems is sought. Optimization techniques and algorithms will
then be created to minimize collateral risk while adhering to mission and logistical
constraints. Finally, sensitivity analysis along with lessons-learned will be presented
to provide take-aways for mission planners acting in these environments under strict
time and mission requirements.

1.6  Statement  of  Original  Contribution 

This  dissertation  seeks  to  fill  in  gaps  in  both  the  literature  and  the  methodology 
by  which  the  USAF  estimates  and  minimizes  collateral  risk.  The  major  contributions 
presented  in  the  following  three  chapters  are: 

•  Characterization  of  airdrop  distribution  based  on  real-world  data. 

•  Formulation  of  the  collateral  damage  problem. 

•  Comparison of non-linear programming algorithms for solving the airdrop
collateral damage minimization problem.

•  Algorithm  for  quickly  finding  optimal  airdrop  parameters  based  on  a  surrogate 
function  for  the  bivariate  normal  distribution. 

•  Multi-objective  formulation  for  the  airstrike  collateral  damage  problem. 

•  Algorithm  for  finding  Pareto  optimal  solutions  for  the  airstrike  problem. 

•  Quantitative  comparison  of  damage  functions  for  use  in  estimating  collateral 
risk. 

•  Quantitative  comparison  of  weapon  employment  guidelines  in  the  collateral 
airstrike  problem. 

•  Formulation  of  limited  intelligence  airstrike  problem. 

•  Quantitative comparison of effects of weighting and a priori intelligence on
optimal firing policy.

II. Minimizing Supply Airdrop Collateral Damage Risk

2.1 Background

2.1.1  Introduction.  Supply  airdrops  occur  for  a  variety  of  reasons.  Supplies 
are  airdropped  to  scientists  at  the  South  Pole,  Humvees  to  American  troops  at 
forward  operating  bases  in  the  mountains  of  Afghanistan,  and  food  and  water  to 
Haitians  in  the  days  after  their  devastating  2010  earthquake. 

The necessity of airdrops as part of emergency disaster relief is underscored
in [91], [27], [62]. Shortly after the 2010 Haitian earthquake, for example, United
States Air Force (USAF) planes were dropping over 200 water and food bundles per
day outside Port-au-Prince [24] from 100 daily flights [66]. Bottlenecks on the roads
in Haiti along with the blockage of the seaport prevented the movement of critical
supplies, forcing the primary source of aid to be airdrops into secured areas [66].
The airdropped supplies helped minimize widespread violence and looting in the
days following the earthquake.

Supply airdrops are typically used in cases where plane landings are either
unsafe or inefficient. In the first four months of 2011, the USAF dropped 25 million
pounds of supplies for troops and locals in Afghanistan and Iraq. This was not
possible by truck. Supply airdrops constitute a vital tactical piece of both war-fighting
and peacekeeping missions for the USAF throughout the world, and consequently the
USAF has developed expertise potentially useful to other supply airdrop agencies.
Airdrops allow ground units to operate in areas that are not tied to ground logistical
resupply. Aerial resupply allows freedom of movement without worrying about
convoys and their large logistical footprint [60].

Supply airdrops have risks beyond those of the equipment and personnel involved.
A recent challenge.gov request [26] underscored the danger that comes with
dropping humanitarian food and water supplies over populated areas (where they
may be in highest demand) and the need to develop alternative methods of performing
such drops. With the uncertain flight paths of airdrops, along with their
weight (up to thousands of pounds), airdrops are particularly dangerous ventures
even when they occur in sparsely-populated areas. Poorly planned or executed airdrops
can result in lost, ruined, or stolen cargo and, more importantly, collateral
damage to the people and buildings near the drop zone. This is compounded by
the fact that an airdrop is typically not a single object but rather a series of objects
referred to as a bundle. This article develops a new technique for estimating risk of
collateral damage associated with supply airdrops and an efficient method for finding
the optimal aimpoint and approach direction for supply missions so as to minimize
collateral damage. We demonstrate, based on real-world drop data, that only an
estimate of the standard deviations in the x- and y-directions, with respect to flight
path, is required to estimate the expected risk of collateral damage during a supply
mission. The standard deviation parameters fit a bivariate normal distribution that
characterizes the error of a drop. Risk of damage is estimated by integrating the
bivariate normal distribution over the areas of undesirable landing locations in the
drop zone for each object in the supply bundle dropped.
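
When the x and y errors are uncorrelated and an undesirable area is approximated by an axis-aligned rectangle in flight-path coordinates, the integral factors into a product of one-dimensional normal probabilities. The sketch below illustrates the calculation under those assumptions; the rectangular obstacles, the avoidance weights, the centering of the bundle on the aimpoint, and all function names are hypothetical simplifications rather than the method developed in this chapter.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_in_rectangle(mean_x, mean_y, sigma_x, sigma_y, rect):
    """Probability that an uncorrelated bivariate normal lands in rect.

    rect = (x_lo, x_hi, y_lo, y_hi) in flight-path coordinates.
    """
    x_lo, x_hi, y_lo, y_hi = rect
    px = (normal_cdf((x_hi - mean_x) / sigma_x)
          - normal_cdf((x_lo - mean_x) / sigma_x))
    py = (normal_cdf((y_hi - mean_y) / sigma_y)
          - normal_cdf((y_lo - mean_y) / sigma_y))
    return px * py


def bundle_risk(aim_x, aim_y, n_objects, spacing, sigma_x, sigma_y, obstacles):
    """Weighted expected number of obstacle hits for a whole bundle.

    Objects are assumed to be released at equal spacing along the y
    (flight path) axis, centered on the aimpoint. obstacles is a list of
    (rect, weight) pairs, where weight is the value of avoidance.
    """
    risk = 0.0
    for i in range(n_objects):
        offset = (i - (n_objects - 1) / 2.0) * spacing
        for rect, weight in obstacles:
            risk += weight * prob_in_rectangle(
                aim_x, aim_y + offset, sigma_x, sigma_y, rect)
    return risk


# Hypothetical scene: one 50 m x 50 m building, valued at 1.0, centered at (0, 0).
building = ((-25.0, 25.0, -25.0, 25.0), 1.0)
risk_near = bundle_risk(0.0, 0.0, 7, 30.0, 70.2, 106.5, [building])
risk_far = bundle_risk(300.0, 0.0, 7, 30.0, 70.2, 106.5, [building])
```

Moving the aimpoint 300 m to the side of the building lowers the computed risk, which is the trade-off the optimization has to balance against distance from the recipients.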

Once  an  estimate  of  collateral  damage  risk  is  established,  the  goal  becomes  to 
find  the  aimpoint  and  flight  approach  angle  which  minimize  collateral  damage  yet 
result  in  a  drop  as  close  to  the  recipients  as  possible.  We  must  also  accommodate  the 
reality  that  different  elements  in  the  scene  may  have  different  values  of  avoidance  (e.g. 
an  occupied  building  versus  a  lake).  The  nature  of  this  search  is  highly  non-linear 
because: 

•  of  the  shape  of  the  bivariate  normal  distribution, 

•  each  object  in  the  bundle  has  its  own  drop  error  distribution,  and 

•  each element in the scene has its unique location, shape, size, and value.

To develop an effective global search technique for this problem, response surface
methodology (RSM), differential evolution (DE), and random search (RS) methods
are compared and combined in this paper to provide quick and effective tools for
finding this optimal solution. The best search algorithm will be shown to be orders
of magnitude faster than enumeration and up to 20% more accurate than the use of
maps and the naked eye.

2.1.2 Nature of Airdrops. Airdrop accuracy has been an ever-present challenge
to airdrop supply planners since the early days of resupply via aircraft airdrop.
Techniques such as high-velocity airdrops for rugged cargo minimize the effects of
wind on airdrop trajectory and maintain accuracy while allowing for higher release
altitudes and increased aircraft survivability. “Reefing” is an airdrop beginning descent
at high velocity for target accuracy and then switching to low velocity in
mid-descent. This allows aircraft to drop cargo from higher altitudes with the accuracy
of a lower altitude drop. Many of these techniques and technologies were born
out of operational necessity and can be used in combination with different chute and
aircraft types.

One of the most successful recent examples of accuracy improvement is the
Joint Precision Airdrop System (JPADS). JPADS uses a steerable parachute and an
airborne guidance unit to control the cargo’s descent and guide it to its desired point
of impact [69]. JPADS offers many advantages over traditional airdrops: increased
accuracy, reduced drop zone size requirements, standoff cargo release, improved aircraft
survivability, and immediate feedback on airdrop accuracy [12]. A disadvantage
of JPADS is its cost relative to traditional “dumb” airdrops (which comprise
the majority of supply airdrops). To keep costs down, recovering and
reusing retrograde airdrop items is necessary, though not always feasible [12]. The
challenge is that an agency providing supply airdrop support may not have the budget
or access to the best techniques and may need to do the best it can with
the equipment it has. This is a main motivation for the development of our
methodology.

Regardless  of  drop  technology  used,  planners  must  choose  carefully  where  to 
target.  If  a  drop  is  too  far  from  the  point  of  use,  recovery  personnel  could  be  exposed 
to  hazard  and  delay  in  getting  relief  to  the  recipients.  If  it  is  too  close  to  ground 
personnel  or  collateral  objects,  then  the  consequences  of  cargo  weighing  several  tons 
traveling  at  speeds  of  over  50  feet  per  second  are  unacceptable.  How  do  mission 
planners  know  how  close  is  too  close?  What  is  the  chance  that  the  cargo  will  impact 
a  collateral  object  inside  the  drop  zone? 

Airdrop  errors  occur  when  an  airdrop  does  not  land  at  its  intended  point  of 
impact.  These  errors  are  commonly  described  as  a  distance  from  the  drop  target  and 
an  angle  with  respect  to  the  drop  zone  axis  or  by  (x,  y )  coordinates.  These  errors 
can  be  caused  by  problems  with  the  computed  air  release  point,  flight  path  error, 
drop  crew  error,  drop  zone  elevations,  cargo  ballistics,  load  weight,  or  unpredicted 
winds.  While  the  calculation  of  a  release  point  takes  into  account  many  factors 
(summarized  graphically  in  Figure  4)  to  determine  the  correct  location  in  the  air 
to  release  an  airdrop  from  the  aircraft,  individual  drops  are  always  subject  to  drop 
error. In the next section, we characterize those errors probabilistically.

Figure 4: Drop Zone Planning Diagram [99]

2.1.3 Bivariate Normal Distribution of Airdropped Bundles. We find that,
under a wide range of drop conditions and technologies, the drop errors from supply
bundles fit the bivariate normal distribution. The data set we studied was provided
by the USAF Air Mobility Command (AMC). It is actual (as opposed to practice
run) data from over 700 airdrops in the field. Figure 5 shows a plot of all of the
data. It is not GPS data; one can almost imagine the ground spotter radioing that
a particular drop was “50 meters long at your 2 o’clock”. The data set has unique
characteristics which we studied at length. For more detail on that analysis, see [23].
Our main findings follow.

Figure 5: Airdrop Scatter

We find that the mean errors in both the x and y direction are statistically
zero. On average (and this result remained when we parsed the data into different
technologies and conditions), the planners hit where they aimed. However, there is
substantial variation. Further, the x and y directional errors have different standard
deviations. This makes sense because typically the timing of an airdrop affects the
y direction whereas wind has the majority of the effect on the x miss distance. We
also find that drop errors in the x-dimension are uncorrelated with those in the
y-dimension (i.e. $\rho = 0$), which makes risk calculations using the bivariate normal
distribution simpler. If $\rho \ne 0$, the same algorithms would be used; run time
would simply be longer.

It is worth stopping here to consider the implication of these findings to an
airdrop planner. The only data necessary to characterize the errors in a supply drop
are the standard deviations in the x and y directions for the equipment being used
in a particular drop zone. This can be accomplished with relatively few data points
after making relatively few flights. Practice drops could even be done away from the
drop site to assess accuracy.

As  an  example,  we  can  characterize  the  types  of  supply  airdrops  that  AMC 
made  in  our  data  set.  The  standard  deviations  for  combinations  of  chute  type  and 
airdrop  altitude  are  summarized  in  Figure  6,  along  with  the  number  of  data  points 
collected  for  each  combination.  We  find  that  the  chute  type  and  the  airdrop  altitude 
have  a  statistically  significant  effect  on  the  error  distribution  patterns,  but  not  the 
aircraft  type.  An  AMC  planner  merely  looks  up  a  value  pair  from  the  right  two 
columns  of  the  figure  to  characterize  fully  the  shape  of  the  risks  of  their  drops. 


Chute Type   Altitude   # of Data Points   σx (meters)   σy (meters)
HV           1000’      6                  70.2          106.5
HV           2000’      79                 77.3          114.3
HV           3000’      62                 118.8         126.2
LV           1000’      321                101.3         144.1
LV           2000’      113                99.1          155.8
LV           3000’      74                 175.4         188.2
LCLA         1000’      21                 28.3          54.2

Figure 6: Standard Deviation Table


To  give  the  reader  a  sense  of  what  the  risk  profiles  look  like,  Figure  7  depicts 
the  density  function  for  seven  individual  objects  in  a  bundle  airdrop  mission.  The 
distance  (five  units)  between  the  elements  is  found  by  multiplying  the  aircraft  speed 
at  drop  by  the  time  interval  between  releases.  Figure  7  shows  the  effect  of  increasing 
standard  deviation.  When  the  standard  deviation  of  the  drop  objects  is  low,  their 
individual  probability  distributions  can  be  easily  identified  as  multiple  modes.  On  the 
other  hand,  with  a  small  separation  distance  relative  to  the  standard  deviations,  the 
graph becomes smoother and the bundle drop error profile approaches unimodality.

Figure 7: Example Multiple Bivariate Normal Distributions
(d = 5, n = 7, σx = σy = 1, 5, 10)

One of these bundle shapes, dropped into the landing scene, is how we characterize
collateral damage risk. Numerically, we integrate the compound bivariate
normal distribution over the undesirable landing areas. In the next section, we begin
the development of an optimal location algorithm based on bivariate bundle risks.
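
The transition from a multimodal to a unimodal bundle profile can be checked numerically along the flight-path axis, where the bundle's marginal density is an equal-weight mixture of normals with means spaced d apart. The sketch below counts local maxima on a grid for the Figure 7 settings d = 5 and n = 7; the grid step, the margin, and the function names are assumptions made for this illustration.

```python
import math


def bundle_density(y, n_objects, spacing, sigma):
    """Along-track density of a bundle: equal-weight mixture of normals."""
    total = 0.0
    for i in range(n_objects):
        z = (y - i * spacing) / sigma
        total += math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return total / n_objects


def count_modes(n_objects, spacing, sigma, step=0.1, margin=4.0):
    """Count strict local maxima of the bundle density on a regular grid."""
    lo = -margin * sigma
    hi = (n_objects - 1) * spacing + margin * sigma
    n_steps = int(round((hi - lo) / step))
    values = [bundle_density(lo + k * step, n_objects, spacing, sigma)
              for k in range(n_steps + 1)]
    return sum(1 for k in range(1, n_steps)
               if values[k - 1] < values[k] > values[k + 1])


modes_tight = count_modes(n_objects=7, spacing=5.0, sigma=1.0)
modes_wide = count_modes(n_objects=7, spacing=5.0, sigma=10.0)
```

With a standard deviation of 1 each of the seven objects produces its own peak, while with a standard deviation of 10 the peaks merge into a single mode.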

2.2  Classes  of  Applicable  Global  Search  Algorithms 

It  is  not  obvious  how  to  design  an  optimum-seeking  algorithm  for  this  problem. 
It  is  one  of  the  contributions  of  this  paper.  In  this  section,  we  introduce  several 
important  candidate  global  search  algorithms. 

2.2.1 Random Search. The random search technique was proposed by Solis and
Wets [93] to find global minima in optimization problems, expanding on the work
of Anderson [4], Rastrigin [84] and Karnopp [53]. The Solis and Wets algorithm
uses normally distributed steps to generate new points. The response value of each
new point is calculated, and if the new point has a higher (worse) objective
function value, a step is taken from the initial point in the opposite direction
to find another new point. If both of these new points are worse than the original
point, then a new starting point is generated. Hart [44] notes that the Solis and Wets
algorithm lacks definitive stopping criteria that yield optimality, typically relying on
a fixed number of iterations. However, their work is particularly useful in situations
where function characteristics are difficult to compute, when the response function
is “bumpy”, when processing time is limited, or when it is highly desirable to find
a global minimum among a large number of local minima. The response function is
assumed to be continuous, since a discontinuous function could conceivably have a
minimum at a point of discontinuity, which would be (nearly) impossible to find
without an exhaustive search of every point in the input space. These are exactly
the conditions of the collateral damage problem
assuming bivariate normally distributed bundle drops.
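
The search loop described above can be sketched in Python as follows. This is a simplified illustration that follows the paragraph's description literally (forward step, opposite step, then restart); it is not the full Solis and Wets procedure, and the function name, step scale `sigma`, iteration count, clipping to the bounds, and the bumpy test function are all assumptions made for the example.

```python
import math
import random

def random_search(f, bounds, sigma=0.5, iterations=2000, seed=0):
    """Random search with normal steps, an opposite-direction retry, and restarts.

    The best point visited is remembered, since a restart may land somewhere worse.
    """
    rng = random.Random(seed)

    def clip(point):
        # Keep candidate points inside the search box.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(point, bounds)]

    def new_start():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    current = new_start()
    f_current = f(current)
    best, f_best = current, f_current
    for _ in range(iterations):  # fixed iteration budget as the stopping rule
        step = [rng.gauss(0.0, sigma) for _ in bounds]
        candidate = clip([c + s for c, s in zip(current, step)])
        f_candidate = f(candidate)
        if f_candidate >= f_current:
            # Forward step is not better: try the opposite direction.
            candidate = clip([c - s for c, s in zip(current, step)])
            f_candidate = f(candidate)
        if f_candidate < f_current:
            current, f_current = candidate, f_candidate
        else:
            # Both steps failed: generate a new starting point.
            current = new_start()
            f_current = f(current)
        if f_current < f_best:
            best, f_best = current, f_current
    return best, f_best

def bumpy(p):
    """Hypothetical test surface with many local minima and a global minimum of 0 at the origin."""
    x, y = p
    return x * x + y * y + 1.0 - math.cos(3.0 * x) * math.cos(3.0 * y)

best_point, best_value = random_search(bumpy, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_point, best_value)
```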

Niederreiter [78] presents quasi-Monte Carlo methods for generating a sequence
of uniformly distributed random points spread over a space. Estimates of the value
of the minima over the searched area can be made using the variance of these
random points, and local search methods can be used in conjunction with these
quasi-Monte Carlo techniques. However, global minimization cannot be guaranteed
on an objective function and domain without a priori information. Hart [44] notes
importantly: “In general, methods that utilize a priori information about a problem
will outperform general purpose methods that utilize less information.” For example,
a method designed specifically to find minima for the bivariate normal problem could
use random search while taking advantage of solving a more specific problem than a
general-purpose search algorithm is made to solve.

2.2.2 Response Surface Methodology. Another important basic approach
for approximating response functions is proposed by Myers and Montgomery [74]
with $y = f(\xi_1, \xi_2, \ldots, \xi_k) + \varepsilon$, where $f$ is the true response
function, which is either unknown or complicated. In this work, $\varepsilon$ will represent
sources of variation that are not accounted for by the derived model, and
$\xi_1, \xi_2, \ldots, \xi_k$ will be the input values for our model; in the airdrop model
these are the aimpoint $(x, y)$ and the approach angle.
Myers and Montgomery further discuss the sequential nature of response surface
methodology, whereby initial hypotheses regarding the important input variables
are formed, often backed up by a screening experiment. The screening experiment
identifies the variables affecting the response variable and which variables’ effects
should be investigated further. After screening takes place, they recommend the
use of a first-order model and the method of steepest descent, where starting from
an initially small portion (referred to by Myers and Montgomery as the region of
interest) of the overall search space, the user begins to move in the direction of the
optimal combination of input variables. This method of steepest descent is performed
iteratively until a minimum for the response function is found in the local region
of interest. It should be noted that the minimum found by this technique is simply
a local minimum for the response function and is not guaranteed to be global.

In each local area of the drop scene, we need to find a local minimum if we
desire to obtain the global minimum. Therefore, we will consider the response surface
methodology of Myers and Montgomery. Specifically, when the random search
produces top candidates we will use RSM to improve the local solutions.

2.2.3 Differential Evolution. Storn and Price [94] present the differential
evolution (DE) heuristic for global optimization over continuous spaces; such methods
are sometimes referred to generically as genetic algorithms. Differential evolution does
not rely on the cost function being differentiable or even continuous. The airdrop
problem presents a continuous cost function, but one whose derivative has no
closed-form solution. Differential evolution is a parallel direct search method which
utilizes D-dimensional (in the airdrop problem, 3-dimensional) parameter vectors as
a population for each generation G. The initial group of vectors is chosen randomly
and should cover the entire parameter space. DE generates new parameter vectors by
adding the weighted difference between two population vectors to a third vector in
a process called mutation. The mutated vector is then mixed with another
predetermined “target” vector to yield the “trial” vector. The objective function value of
the “trial” vector is compared to that of the “target” vector. In the selection step,
whichever vector has the lower function value will be the “target” vector for the
next generation. Because of this selection rule, no generation is worse than the one
before it in cost function value, and the mutations help prevent getting trapped in
local minima. Additionally, keeping multiple vectors in each generation helps prevent
trapping in local minima. The baseline model of Storn and Price is DE/rand/1/bin,
meaning that the vector to be mutated is chosen randomly, there is one difference
vector, and the crossover scheme is binomial.

For constrained differential evolution, constraints are dealt with by the insertion
of a boundary penalty into the objective function [94]. Michalewicz and
Schoenauer [70] note that the methods for dealing with constraints in a genetic
algorithm fall into four categories:

•  methods  which  preserve  feasibility  of  the  solutions, 

•  methods  which  use  penalty  functions,  such  as  Storn  and  Price  [94], 

•  methods  which  make  distinctions  between  feasible  and  infeasible  solutions,  and 

•  hybrid  methods. 

Lampinen [58] discusses the laborious and difficult nature of the selection of
penalty parameters, and proposes a method that either preserves feasibility (if the
previous generation’s solution was feasible), moves towards feasibility (if both the
current and previous generations’ solutions are infeasible), or moves towards
optimality (if both generations’ solutions are feasible). The Lampinen approach does not
rely upon starting solutions which are feasible. For the use of differential evolution
in the airdrop problem, disallowing solutions that fall outside of the feasible region
would also be acceptable, since the differential evolution method will not select these
disallowed solutions in the “selection” step of the algorithm. Michalewicz and
Schoenauer would classify this approach as a method which makes “distinctions between
feasible and infeasible solutions.”
This approach appears to have merit. The airdrop problem does not have
overly complex constraints that preclude the easy generation of multiple, feasible
starting solutions, and the objective function is not differentiable in closed form;
both properties are well served by evolutionary approaches.


2.3  Methodology 

2.3.1 Formulation of Optimal Supply Airdrop Location. Assuming a bivariate
normal distribution with known parameters $\sigma_x$, $\sigma_y$, and $\rho$, our formulation
is:

\[
\min_{\theta, x, y} \; \sum_{j=1}^{m} V_j \left( 1 - \prod_{i=1}^{n} \left( 1 -
\int_{x_{j\min}}^{x_{j\max}} \int_{y_{j\min}}^{y_{j\max}}
\frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}
\exp\left( -\frac{1}{2(1-\rho^2)} \left[ \frac{(x-x_i)^2}{\sigma_x^2}
- \frac{2\rho\,(x-x_i)(y-y_i)}{\sigma_x\sigma_y}
+ \frac{(y-y_i)^2}{\sigma_y^2} \right] \right) dy\,dx \right) \right)
\tag{9}
\]

subject to
$\theta_{\min} \le \theta \le \theta_{\max}$ (if desired)
$x_{\min} \le x \le x_{\max}$
$y_{\min} \le y \le y_{\max}$

where
$m$ - number of collateral objects
$x_i = x + s(i - 1)\sin\theta$
$y_i = y + s(i - 1)\cos\theta$
$V_j$ - value of the jth collateral object
$x_i$ - longitude of the aimpoint of the ith bundle
$y_i$ - latitude of the aimpoint of the ith bundle
$\theta$ - approach angle
$\sigma_x$ - horizontal miss distance standard deviation
$\sigma_y$ - vertical miss distance standard deviation
$n$ - number of objects in the airdrop bundle
$s$ - distance of separation of the objects in the airdrop
An important part of the formulation is that the likelihoods of individual collateral
objects being struck must be modified to account for the possibility of multiple hits.
The overall probability of a hit is found by $1 - \prod_{i=1}^{n} (1 - p_i)$, where $p_i$ is
the likelihood of an individual airdropped object striking a specific collateral object.
The value of the collateral objects (that is, the value of avoiding them) can be set to
any positive value; if they are to be treated equally then they are all set to 1. The
solution to our formulation is the aimpoint and approach angle which minimize the
total collateral value of the bundles striking collateral objects, which is different from
choosing the aimpoint and angle which have the lowest likelihood of striking any
collateral objects.
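
To make the objective concrete, the following Python sketch evaluates it for axis-aligned rectangular collateral objects. It is a minimal sketch under one simplifying assumption that is not part of the formulation: the correlation is taken to be zero, so the double integral factors into a product of one-dimensional normal probabilities. The function names and the example aimpoint are illustrative; the three rectangles in the usage example are rows from the collateral object table of the base problem.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def hit_probability(rect, cx, cy, sigma_x, sigma_y):
    """Probability that one object aimed at (cx, cy) lands in rect (assumes rho = 0)."""
    x_min, x_max, y_min, y_max = rect
    px = normal_cdf((x_max - cx) / sigma_x) - normal_cdf((x_min - cx) / sigma_x)
    py = normal_cdf((y_max - cy) / sigma_y) - normal_cdf((y_min - cy) / sigma_y)
    return px * py

def collateral_objective(x, y, theta_deg, objects, n, s, sigma_x, sigma_y):
    """Expected collateral value: sum_j V_j * (1 - prod_i (1 - p_ij))."""
    theta = math.radians(theta_deg)
    total = 0.0
    for x_min, x_max, y_min, y_max, value in objects:
        miss_all = 1.0
        for i in range(1, n + 1):
            # Aimpoint of the i-th object along the approach direction.
            xi = x + s * (i - 1) * math.sin(theta)
            yi = y + s * (i - 1) * math.cos(theta)
            p = hit_probability((x_min, x_max, y_min, y_max), xi, yi, sigma_x, sigma_y)
            miss_all *= 1.0 - p
        total += value * (1.0 - miss_all)
    return total

# Three rectangles (west, east, south, north, value) and an arbitrary aimpoint.
example_objects = [
    (840.0, 870.0, 60.0, 130.0, 1.0),
    (80.0, 110.0, 230.0, 260.0, 1.0),
    (170.0, 240.0, 290.0, 300.0, 1.0),
]
print(collateral_objective(300.0, 500.0, 90.0, example_objects,
                           n=10, s=60.0, sigma_x=100.0, sigma_y=100.0))
```

A nonzero correlation would require a numerical bivariate integration in place of `hit_probability`; the outer sum and product would be unchanged.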

A final constraint is added to avoid unbounded solutions which occur anywhere
outside the scene or potentially at the edge of the scene, because there is no
collateral  damage  to  be  avoided  there.  Based  on  our  experience,  we  allow  solutions 
which  produce  only  airdrops  in  which  the  middle  of  the  bundle  lands  no  closer  than 
two  grid  lines  from  the  boundary  of  the  scene.  Remember,  however,  this  does  not 
guarantee  that  all  the  objects  in  a  bundle  will  actually  land  within  the  scene  due  to 
the  uncertainty  of  the  bundle  flight  paths. 

In  the  following  section  we  introduce  a  base  problem,  motivated  by  a  real-world 
airdrop  supply  scene.  It  is  intended  to  make  concrete  what  we  are  doing  and  be  the 
starting  point  to  develop  test  problems  for  the  specific  algorithms  we  will  develop 
and  compare  next. 

2.3.2  A  Drop  Zone  Problem  Solved.  Our  drop  zone  is  a  sparsely  populated 
setting.  This  is  typical  of  a  humanitarian  supply  drop  area  selected  to  be  near  a 
city,  but  not  in  the  city.  Figure  8  shows  the  scene  we  have  defined  with  (shaded) 
elements  to  be  avoided  and  the  optimal  drop  location  and  angle  found  (the  series  of 
circles).  Throughout  this  paper,  the  sizes  of  the  elements  shown  in  the  drop  scene 
are  to  scale,  while  the  sizes  of  the  circles  are  proportional  (but  not  to  scale)  to  the 
magnitude of the standard deviations of drop errors. Let us discuss setting up and
solving this example in detail.

Figure 8: Scenario Layout with Optimal Aiming Location

We  have  chosen  to  characterize  the  search  space  as  1000  x  1000  meters  with 
100  (10  x  10)  grid  zones  100  m  by  100  m,  with  the  requirement  that  the  middle 
of  the  bundle  object  lands  at  least  200  m  from  the  edge.  Keeping  the  size  of  the 
search  space  small  is  important,  both  because  it  determines  the  magnitude  of  the 
optimization  problem  and  it  bounds  the  area  where  the  recovery  team  will  have  to 
travel  to  acquire  the  dropped  supplies. 

In  this  base  problem,  an  airdrop  plane  traveling  at  120  meters/second  drops 
objects  0.5  seconds  apart.  This  yields  a  distance  between  the  bundle  objects  of 
60  meters  so  that  with  ten  objects  dropped  there  is  a  total  path  length  of  540  m. 
The  drop  technology  involved  has  standard  deviations  of  100  m  in  both  the  x  and 
y  direction  (typical  values  across  the  types  of  airplanes  and  chute  types  that  AMC 
uses).  Figure  9  collects  this  data  together  and  is  actually  the  format  of  an  input 
screen  for  our  solution  program. 
Input Parameters

stdev x              100
stdev y              100
number of bundles    10
separation           60
north boundary       1000
south boundary       0
west boundary        0
east boundary        1000

Figure 9: Input Parameter Table


The collateral objects, and their avoidance values, are entered via another table.
Figure 10 contains the data for our base problem with 20 collateral objects. For
the  purposes  of  this  paper,  collateral  objects  are  taken  to  be  rectangular  facing  the 
axes  of  the  coordinate  system.  (As  an  aside,  circular  buildings  are  well  represented 
by  a  square.)  Complex  shapes  can  be  built  with  several  rectangles.  A  larger  object 
could  in  fact  be  a  cluster  of  buildings.  This  scenario  intentionally  includes  a  variety 
of  objects  with  lengths  and  widths  between  10  and  100  meters.  We  are  envisioning 
every  collateral  object  as  being  occupied  housing  with  equal  avoidance  value.  Of 
course  if  an  object  is  known  to  be  just  a  barn,  its  value  could  be  decreased. 

Note  that  we  have  chosen  to  use  a  coordinate  system  of  the  cardinal  directions. 
This  is  not  required  but  it  makes  planning  with  maps,  GPS,  or  satellite  imagery  easier 
and,  regardless,  the  solution  algorithm  picks  its  optimal  angle  in  terms  of  whatever 
coordinate  system  is  selected. 
West ($x_{j\min}$)   East ($x_{j\max}$)   South ($y_{j\min}$)   North ($y_{j\max}$)   Value
840                  870                  60                    130                   1
80                   110                  230                   260                   1
170                  240                  290                   300                   1
400                  480                  770                   860                   1
220                  300                  320                   340                   1
650                  720                  340                   400                   1
630                  710                  690                   730                   1
310                  400                  190                   210                   1

Figure 10: Collateral Objects


In  terms  of  solution,  collateral  objects  and  their  values  determine  the  more 
attractive  drop  paths  and  locations.  Figure  8  shows  the  true  optimal  solution  in 
this  scenario.  Unsurprisingly,  the  optimal  aiming  location  for  the  bundles  lies  in  the 
rather  large  gap  in  the  buildings  on  the  western  side  of  the  layout.  Dropping  in  this 
location  yields  a  minimum  objective  function  value  of  0.088  which  means  that,  in 
an  individual  bundle  drop,  an  average  of  only  0.088  collateral  objects  will  be  struck 
(since  each  object  has  a  value  of  1)  by  the  ten  bundle  objects.  Note  that  there  is  a 
small  chance  that  some  of  them  may  be  struck  more  than  once. 

2.3.3  Solution  Methods.  In  this  section,  we  undertake  a  series  of  studies 
comparing,  combining,  and  evaluating  the  global  search  methods  of  Section  2.2  to 
solve  the  formulation  of  Section  2.3.1  for  problems  of  the  type  in  Section  2.3.2. 

2.3.3.1 Surrogate Functions. Regardless of the search approach used,
calculations of the cost function are computationally expensive since a complicated
integral must be computed at each point on the grid that lies within a collateral
object. Surrogate functions are routinely used in evolutionary algorithms when the
cost function is complicated and requires a large amount of processing time. This
is the case for the multiple integrations of the bivariate normal distribution for
the airdrop collateral damage problem. To combat this, for an initial number of
generations of differential evolution we will use a surrogate function which requires
less processing time. The key, however, is to find the correct number of generations
at which to make the switch between using the surrogate cost function and the actual
cost function. Using the surrogate cost function too long will result in convergence to
a sub-optimal solution (a solution that is optimal for the surrogate function, but not
the true cost function). Switching to the actual cost function too soon will negate
the time savings gained by using the surrogate cost function.

The surrogate function created for the integration of the bivariate normal distribution
will be based on an approximation of its probability density function (pdf).
By  experimentation,  we  find  that  just  four  rectangular  prisms  approximate  this  pdf 
adequately.  Figure  11  shows  graphically  the  normal  distribution  approximation  used 
for  the  surrogate  function. 
Figure 11: Surrogate Approximation of the Normal Distribution

As evidence of the accuracy of the 4-point surrogate, Figure 12 compares the
objective function value of the surrogate normal to the actual probability value
for over 10,000 bundle object and collateral object impacts. The graph shows low
error and high correlation (0.894) between the two functions. The speed-up from the
surrogate does not prevent an accurate search for the optimum.

Figure 12: Surrogate versus Actual Objective Function

2.3.3.2 Differential Evolution Algorithm. We have implemented the
differential evolution method of Storn and Price [94] described in Section 2.2.3 and
shown in Figure 13. The differential evolution algorithm has as its key inputs the
number of generations and the number of solutions within each generation. After a
series of trials we find that the following constants for the differential evolution
algorithm provide convergence to one solution while keeping the run time for the
algorithm low: F = 0.8, CR = 0.5, NP = 40, and generations = 100. This means there
are 40 x 100 = 4,000 separate calculations of the objective function.
STEP 1: (Initialization) Set F, CR, generations, NP; choose “target” population of NP
3-dimensional vectors, yielding $(x_{1,1,0}, x_{2,1,0}, x_{3,1,0})$ through
$(x_{1,NP,0}, x_{2,NP,0}, x_{3,NP,0})$; $k = 0$.

STEP 2: (Mutation) Create “mutant” population, yielding $(v_{1,1,k}, v_{2,1,k}, v_{3,1,k})$
through $(v_{1,NP,k}, v_{2,NP,k}, v_{3,NP,k})$, where
$v_{i,n,k} = x_{i,r_0,k} + F\,(x_{i,r_1,k} - x_{i,r_2,k})$ with $r_0, r_1, r_2$
randomly chosen mutually different integers from 1 to NP.

STEP 3: (Crossover) Create “trial” population, yielding $(u_{1,1,k}, u_{2,1,k}, u_{3,1,k})$
through $(u_{1,NP,k}, u_{2,NP,k}, u_{3,NP,k})$, where $u_{i,n,k} = x_{i,n,k}$ if
$\mathrm{rand}(0,1) > CR$, otherwise $u_{i,n,k} = v_{i,n,k}$.

STEP 4: (Selection) Create new “target” population: if
$f(x_{1,n,k}, x_{2,n,k}, x_{3,n,k}) < f(u_{1,n,k}, u_{2,n,k}, u_{3,n,k})$ then
$(x_{1,n,k+1}, x_{2,n,k+1}, x_{3,n,k+1}) \leftarrow (x_{1,n,k}, x_{2,n,k}, x_{3,n,k})$,
otherwise
$(x_{1,n,k+1}, x_{2,n,k+1}, x_{3,n,k+1}) \leftarrow (u_{1,n,k}, u_{2,n,k}, u_{3,n,k})$;
$k \leftarrow k + 1$.

STEP 5: (Termination) If k = generations, STOP, otherwise go to STEP 2.

F - amplification factor, CR - crossover factor, NP - solutions per generation, x - target
population, u - trial population, v - mutant population, k - current generation, n - current
solution, i - current dimension

Figure 13: Differential Evolution Algorithm
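
The five steps of Figure 13 can be sketched in Python as follows. This is an illustrative sketch and not the implementation used for the results in this chapter. Three details are assumptions added for the example: the three mutation indices are also required to differ from the current index n, one randomly chosen dimension is always taken from the mutant so that the trial differs from its target, and trial vectors are clipped to the bounds. The shifted sphere test function is hypothetical and stands in for the airdrop objective.

```python
import random

def differential_evolution(f, bounds, F=0.8, CR=0.5, NP=40, generations=100, seed=0):
    """DE/rand/1/bin sketch following STEP 1 to STEP 5 of Figure 13."""
    rng = random.Random(seed)
    dim = len(bounds)
    # STEP 1: random initial "target" population covering the parameter space.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(NP)]
    cost = [f(p) for p in pop]
    for _ in range(generations):  # STEP 5: stop after a fixed number of generations
        new_pop, new_cost = [], []
        for n in range(NP):
            # STEP 2: mutation from three mutually different members (also != n).
            r0, r1, r2 = rng.sample([m for m in range(NP) if m != n], 3)
            mutant = [pop[r0][i] + F * (pop[r1][i] - pop[r2][i]) for i in range(dim)]
            # STEP 3: binomial crossover between target and mutant.
            forced = rng.randrange(dim)
            trial = [mutant[i] if (rng.random() <= CR or i == forced) else pop[n][i]
                     for i in range(dim)]
            # Assumption: keep the trial inside the box by clipping.
            trial = [min(max(t, lo), hi) for t, (lo, hi) in zip(trial, bounds)]
            # STEP 4: selection keeps the target only if it is strictly better.
            trial_cost = f(trial)
            if cost[n] < trial_cost:
                new_pop.append(pop[n])
                new_cost.append(cost[n])
            else:
                new_pop.append(trial)
                new_cost.append(trial_cost)
        pop, cost = new_pop, new_cost
    best = min(range(NP), key=lambda n: cost[n])
    return pop[best], cost[best]

def shifted_sphere(p):
    """Hypothetical smooth test function with minimum 0 at (1, 1, 1)."""
    return sum((v - 1.0) ** 2 for v in p)

best_vector, best_cost = differential_evolution(shifted_sphere, [(-5.0, 5.0)] * 3)
print(best_vector, best_cost)
```

With NP = 40 and 100 generations the loop evaluates 4,000 trial vectors; this sketch additionally evaluates the 40 members of the initial population.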


2.3.3.3 Differential Evolution with Surrogate. We have implemented
two variations of the Differential Evolution algorithm where the surrogate function is
used to speed up the process. In the first variation, the initial 50% of the generations
use the surrogate function, and in the second the surrogate function is used in 100%
of the generations. Once the final generation of solutions is obtained, those locations
have their true objective function values calculated and then compared in order to
determine the best solution. Random starting points within the user-defined grid
initialize both algorithms.

2.3.3.4 Response Surface Algorithm. For the response surface algorithm,
the solution space is initially searched in order to find good candidate
solutions. This step is accomplished by uniformly searching the solution space, finding
the best ten solutions and then using these ten solutions as inputs to the second
phase of the algorithm. The second phase takes each of these ten solutions and
determines the regression equation about each solution. From there (again, for each
solution) movement is made in the direction of steepest descent by a given step
size. Those new ten solutions then become the inputs and the process is repeated
with smaller step sizes, for a given number of iterations. Finally, the ten resulting
solutions are compared to determine the best solution. For the response surface
method in our base problem scenario, for example, there were 7*7*35 + 2*2*3*10*10
= 1,715 + 1,200 = 2,915 separate calculations of the cost function.

Once  the  ten  best  solution  vectors  from  enumeration  are  found,  each  solution 
is  treated  as  the  starting  point  and  the  algorithm  in  Figure  14  is  performed: 


STEP 1: (Initialization) $k \leftarrow 0$, $(x_0, y_0, \theta_0) \in D$ (given), choose initial
search radius $r_0$ and step size change $\alpha$.

STEP 2: (Steepest Descent Determination) Determine regression equation $f$ around
$(x_k, y_k, \theta_k)$ with a two-level factorial design:
$f(x, y, \theta) = b_0 + b_1 x + b_2 y + b_3 \theta$.

STEP 3: (Calculate New Solution)
$x_{k+1} \leftarrow x_k - \Delta_k b_1$
$y_{k+1} \leftarrow y_k - \Delta_k b_2$
$\theta_{k+1} \leftarrow \theta_k - \Delta_k b_3$

STEP 4: (Termination) If $k = 10$ STOP, otherwise $r_{k+1} \leftarrow \alpha r_k$,
$k \leftarrow k + 1$ and go to STEP 2.

k - current generation, f - regression equation about a given point, $\alpha$ - step size change
constant, b - coefficients of regression function, r - step size, $\Delta$ - dimensional step size

Figure 14: Response Surface Methodology Algorithm
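
One possible reading of the steepest descent loop in Figure 14 is sketched below in Python. It is a hedged illustration, not the program used for the results: it assumes a 2^3 factorial design with points at the current solution plus or minus the search radius in each coordinate, uses a single scalar step size for all three coordinates, and shrinks both the radius and the step by the same constant each iteration. Function names, default values, and the quadratic test function are hypothetical.

```python
import itertools

def fit_first_order(f, center, radius):
    """Fit f ~ b0 + b1*x + b2*y + b3*theta on a 2^3 factorial design.

    Design points sit at center +/- radius in each coordinate. The coded
    design is orthogonal, so least squares reduces to signed averages.
    Returns the mean response and the three slopes in natural units.
    """
    signs = list(itertools.product((-1.0, 1.0), repeat=3))
    responses = [f(*[c + sg * radius for c, sg in zip(center, sign)]) for sign in signs]
    mean_response = sum(responses) / len(responses)
    slopes = [
        sum(sign[i] * resp for sign, resp in zip(signs, responses)) / (len(signs) * radius)
        for i in range(3)
    ]
    return mean_response, slopes

def steepest_descent_search(f, start, radius=1.0, step=0.25, alpha=0.5, iterations=10):
    """Repeat: fit a first-order model, step against its slopes, shrink the step."""
    point = list(start)
    for _ in range(iterations):
        _, slopes = fit_first_order(f, point, radius)
        point = [p - step * b for p, b in zip(point, slopes)]
        radius *= alpha  # smaller region of interest next time
        step *= alpha    # smaller move next time
    return point

def quadratic(x, y, theta):
    """Hypothetical smooth test response with its minimum at (3, -1, 2)."""
    return (x - 3.0) ** 2 + (y + 1.0) ** 2 + (theta - 2.0) ** 2

end_point = steepest_descent_search(quadratic, (0.0, 0.0, 0.0))
print(end_point, quadratic(*end_point))
```

Because the step shrinks geometrically, the search settles near a local improvement of its starting point rather than marching indefinitely, which is why the method is run from several starting solutions.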

2.3.3.5 Response Surface Methodology with Surrogate. We have also
programmed the response surface algorithm with a surrogate, using the surrogate
objective function to perform the enumeration step before switching to the
true objective function for the response surface portion. Thus, the best ten solutions
from the surrogate objective function are found, and the response surface algorithm
is performed on these solutions (using the true objective function), moving from
these ten solutions towards better solutions over a series of iterations.
Examples of the search movement are shown in Figure 15. Starting points are at
the grid intersections and the points are the midpoint of the bundle drop. This
figure shows the unique nature of search evolution in our problem, wherein the angle
changes during the search and not merely the location.

Figure 15: Steepest Descent Movement in RSM

2.3.3.6 Enumeration. One might ask, why not simply evaluate at
many points in the drop area and pick the best one of those as the estimate of the
global optimum? The problem is computational effort: when searching the entire
solution space in an enumerative manner, the size of the steps taken has a significant
effect on both the optimal solution found and the number of trials necessary to
find it. As the step size approaches zero, the number of trials
approaches infinity at a rate inversely proportional to the cube of the step size (since
this is a three-dimensional problem). Additionally, as the step size approaches zero,
the solution found approaches the true optimal solution for the scenario. Figure 16
shows this tradeoff in our base scenario problem. Enumeration can, in fact,
find the optimal solution of 0.088. The problem is that, for our example, it requires
673,501 computations of the objective function.
Figure 16: Optimal Result and Calculations vs. Step Size
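
The cubic growth in the number of trials can be checked with a short Python calculation. The grid extents below are assumptions: aimpoints from 200 m to 800 m in each direction (consistent with the 200 m edge margin of the 1000 m scene) and approach angles from 0 to 180 degrees. They are used here because, with a 10 m and 1 degree grid, they reproduce the 673,501 evaluations quoted for full enumeration, and with the 50 m and 5 degree coarse grid they reproduce 6,253.

```python
def grid_points(lo, hi, step):
    """Number of evenly spaced grid values from lo to hi, inclusive."""
    return int(round((hi - lo) / step)) + 1

def enumeration_count(xy_step, angle_step,
                      xy_range=(200.0, 800.0), angle_range=(0.0, 180.0)):
    """Objective evaluations needed to enumerate (x, y, theta) on a regular grid."""
    n_x = grid_points(xy_range[0], xy_range[1], xy_step)
    n_y = grid_points(xy_range[0], xy_range[1], xy_step)
    n_theta = grid_points(angle_range[0], angle_range[1], angle_step)
    return n_x * n_y * n_theta

print(enumeration_count(10.0, 1.0))  # 61 * 61 * 181 = 673501
print(enumeration_count(50.0, 5.0))  # 13 * 13 * 37 = 6253
print(enumeration_count(5.0, 0.5))   # halving every step: roughly eight times the fine count
```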

Nevertheless, enumeration offers a comparison value in terms of accuracy and
speed. We have gone so far as to implement enumeration with the surrogate technique.
We will also evaluate the behavior of enumeration on a coarser grid.

2.4 Global Search Results

In  this  section,  we  compare  all  of  the  algorithms  introduced  in  this  paper.  The 
merit  metrics  are  running  time  and  accuracy.  The  results  are  all  solutions  to  the  base 
scenario  of  Section  2.3.2;  however  our  computational  experience  leads  us  to  believe 
that  the  relative  performance  of  the  algorithms  is  the  same  on  other  problems  of  our 
type. 

In  addition,  we  investigate  the  behavior  of  solutions  for  a  wide  range  of  scenario 
variations  that  a  planner  might  face.  This  serves  the  purpose  of  validating  our 
work  and,  more  importantly,  reveals  tradeoffs  and  improvement  methods  that  we 
consistently  find. 
2.4.1 Comparison of Search Algorithms. Figure 17 collects the results of
our algorithm comparison studies. As discussed earlier, there are three DE algorithms,
three enumeration algorithms, and two RSM algorithms. Let us first look
within the algorithm types and then across them.

The  DE  algorithm  consistently  finds  (and  should  by  its  design)  the  best  solution 
because  it  is  defined  in  terms  of  the  true  objective  function.  Surrogacy  speeds  the 
calculation  up,  but  the  quality  of  the  solution  can  suffer. 

The  enumeration  algorithm  benefits  the  most  from  surrogacy  because  of  the 
high number of objective function evaluations required. Trying to improve computational
time by using a coarser grid does not work as the solution quality inevitably
suffers.  This  is  the  behavior  we  saw  in  Figure  16. 

The  RSM  algorithm  produces  high  quality  results  but  has  the  limitation  of 
always approximating the true objective function. This is compounded by the surrogacy
approximation. With that said, RSM still produces good quality solutions
consistently  in  a  reasonable  time. 


Technique                                      Computations of   Computations of   Objective        Location
                                               True Objective    Surrogate         Function Value   (x, y, $\theta$)
Differential Evolution                         4,000             -                 0.088            (26.7, 51.6, 80.9)
Differential Evolution with 50% Surrogate      2,000             2,000             0.089            (26.7, 51.6, 80.9)
Differential Evolution using 100% Surrogate    -                 4,000             0.094            (27.0, 53.3, 88.0)
Explicit Enumeration                           673,501           -                 0.089            (27.0, 52.0, 83.0)
Explicit Enumeration, coarse (50 meters,       6,253             -                 0.109            (30.0, 50.0, 80.0)
  5 degrees of angle)
Explicit Enumeration using Surrogate           -                 673,501           0.092            (27.0, 53.0, 87.0)
Response Surface Methodology                   2,915             -                 0.089            (27.0, 51.8, 80.9)
Response Surface Methodology with Surrogate    1,200             1,715             0.089            (27.0, 51.5, 79.9)

Figure 17: Summary of Results for Techniques


Figure  18  summarizes  the  computational  cost  versus  solution  accuracy  of  the 
algorithms.  Computational  cost  is  calculated  as  the  number  of  calculations  of  the 
true  objective  function  plus  25%  of  the  number  of  calculations  of  the  surrogate 
function (that being the approximate savings). The lower left corner of Figure 18 is
the  most  desired  having  a  low  calculation  time  and  a  low  optimal  solution  value.  The 
computationally-intensive  method  of  enumeration  is  dominated  and  is  not  the  way 
to  approach  solving  this  class  of  problem.  The  remaining  Pareto  optimal  approaches 
then  are  differential  evolution,  response  surface  method  using  the  surrogate  function, 
and  the  differential  evolution  method  using  only  the  surrogate  function. 
Figure 18: Results of Various Methods
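
The cost measure and the Pareto comparison can be reproduced from the values in Figure 17 with the short Python sketch below. The data rows are copied from that figure and the 25% surrogate weight is the one stated in the text; the function names and the dominance test are illustrative.

```python
SURROGATE_WEIGHT = 0.25  # a surrogate evaluation counted as 25% of a true evaluation

# (technique, true objective computations, surrogate computations, objective value)
results = [
    ("Differential Evolution", 4000, 0, 0.088),
    ("Differential Evolution with 50% Surrogate", 2000, 2000, 0.089),
    ("Differential Evolution using 100% Surrogate", 0, 4000, 0.094),
    ("Explicit Enumeration", 673501, 0, 0.089),
    ("Explicit Enumeration coarse", 6253, 0, 0.109),
    ("Explicit Enumeration using Surrogate", 0, 673501, 0.092),
    ("Response Surface Methodology", 2915, 0, 0.089),
    ("Response Surface Methodology with Surrogate", 1200, 1715, 0.089),
]

def computational_cost(true_evals, surrogate_evals):
    """True evaluations plus the weighted surrogate evaluations."""
    return true_evals + SURROGATE_WEIGHT * surrogate_evals

def pareto_front(rows):
    """Techniques not dominated in (computational cost, objective value), both minimized."""
    scored = [(name, computational_cost(t, s), value) for name, t, s, value in rows]
    front = []
    for name, cost, value in scored:
        dominated = any(
            other_cost <= cost and other_value <= value
            and (other_cost < cost or other_value < value)
            for _, other_cost, other_value in scored
        )
        if not dominated:
            front.append(name)
    return front

for technique in pareto_front(results):
    print(technique)
```

Under this cost measure, the 50% surrogate variant of differential evolution is dominated by RSM with surrogate, which reaches the same objective value at lower cost.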

We  feel  that,  although  it  is  fast,  the  DE  with  100%  surrogate  gives  away  too 
much  in  terms  of  accuracy.  Either  of  the  other  two  Pareto  methods  is  a  good  choice, 
depending  on  the  tradeoff  for  the  planner  in  terms  of  speed  versus  accuracy.  For 
example,  using  a  2.7  GHz  desktop  PC  with  4.0  GB  RAM,  we  obtain  solutions  to  the 
base  problem  in  a  few  minutes  of  clock  time.  So,  choosing  pure  DE  might  mean  an 
8  minute  response  time  versus  a  2  minute  response  for  RSM  with  surrogate  (whose 
solution  might  be  1-2%  worse).  For  accuracy,  and  consistency  of  comparison,  we  use 
DE  for  all  the  remaining  case  studies  in  this  paper. 
2.4.2 Guidelines to Airdrop Planners. In this section, we solve fourteen
variations of a new airdrop problem (Scenario #1 below) using the techniques
developed in this paper to explore the effects of changing inputs (the drop technology,
the scene itself, etc.). The variations represent real-world situations that an airdrop
planner might face.

Choosing  bundle  configurations: 

•  Using  bundles  of  size  five  rather  than  ten.  This  investigates  the  impact  of  two 
missions  of  five  versus  one  mission  with  a  size  ten  bundle  (Scenario  #2). 

•  Using  a  bundle  separation  distance  of  30  meters  rather  than  60  meters.  This 
shows  the  benefit  of  a  plane  traveling  slower  over  the  dropzone  (#3). 

Choosing  drop  technology  (the  standard  deviation  values  were  taken  from  Figure  6): 

• Using higher (#4), lower (#5), and unequal standard deviations (#6) to
underscore, in general, the effect of delivery-system accuracy on collateral damage.

• Specifically using an LCLA chute at 1000 feet (the lowest standard deviations
of all chute-altitude combinations) (#7).

• Specifically using an LV chute at 3000 feet (the highest standard deviations of
all chute-altitude combinations) (#8).

• Specifically using an LV chute at 1000 feet (the most common chute-altitude
combination) (#9).

• Specifically using an HV chute at 2000 feet (the most common HV altitude)
(#10).

Effect  of  changes  in  the  nature  of  collateral  objects: 

•  Using  differing  values  for  the  collateral  objects  rather  than  all  collateral  objects 
having  the  same  values.  Collateral  objects  are  randomly  given  values  between 
0  and  2  rather  than  the  previous  common  value  of  1  (#11). 
• Using smaller collateral objects ranging from 10 x 10 meters to only 30 x 30
meters rather than up to 100 x 100 meters. This demonstrates the benefit of
more accurate intelligence on the nature of the collateral objects in the drop
scene (#12).

• Using ten collateral objects in the scene rather than twenty, demonstrating the
benefit of moving to more sparsely populated areas (#13).

Consequences  of  limitations  in  travel  over  the  drop  scene: 


• Using a flying angle constraint (this is the “as desired” constraint in our
formulation). This demonstrates that weather, terrain, or other flight-safety
logistics can have detrimental consequences on drop risk (#14).

The  inputs  and  results  for  all  scenarios  are  presented  in  Figure  19,  wherein  the 
last  column  is  the  damage  value  at  optimum  (lower  is  better).  Like  scenarios  are 
grouped  together. 


#   Scenario                     σx (m)  σy (m)  n   s (m)  x      y      θ        value
1   standard                     100     100     10  60     465.7  438.6  56.09    0.206
2   fewer bundles                100     100     5   60     435.6  468.7  81.81    (illegible)
3   smaller separation           100     100     10  30     466.1  472.8  81.36    0.154
4   higher standard deviation    150     150     10  60     362.9  200.8  -171.59  0.308
5   lower standard deviation     50      50      10  60     798.9  625.0  71.23    0.015
6   unequal standard deviations  130     70      10  60     555.8  457.3  -104.33  0.266
7   LCLA - 1000'                 28.3    54.2    10  60     796.4  628.4  69.78    0.000
8   LV - 3000'                   175.4   188.2   10  60     783.3  791.7  -8.30    0.322
9   LV - 1000'                   101.3   144.1   10  60     581.9  513.1  -123.05  0.231
10  HV - 2000'                   77.3    114.3   10  60     793.5  622.9  72.59    0.128
11  unequal collateral values    100     100     10  60     800.0  250.5  131.85   0.136
12  smaller collateral objects   100     100     10  60     475.4  423.3  56.05    0.017
13  fewer collateral objects     100     100     10  60     799.4  799.8  57.23    0.006
14  constrained approach angle   100     100     10  60     369.9  269.7  2.96     0.265

σx - standard deviation in x-direction, σy - standard deviation in y-direction,
n - number of objects in bundle, s - bundle separation distance,
x - optimal horizontal location, y - optimal vertical location, θ - optimal approach angle

Figure  19:  Summary  of  Results  for  Scenarios 
Figure 19 demonstrates that cutting the standard deviation in half from 100
meters (#1) to 50 meters (#5) lowers the expected collateral damage risk
by 93%, underscoring the need for accurate delivery systems. The vast improvement
possible with a technologically advanced delivery system can be seen in the near-zero
damage caused when using the LCLA chute type from 1000 feet (#7). Conversely,
the LV chute type from 3000 feet (#8), with its large standard deviations, has a
collateral damage risk 61% higher than the standard case (#1).

Figure 19 also shows the sharp decline in expected damage from lowering the
number of collateral objects (#13) or decreasing the size of collateral objects (#12).
This is to be expected (and is quantified by our model). It is more surprising that a
50% decrease in the separation (#3) from flying slower yields only a small decrease
in expected collateral damage (25%).

Let  us  turn  from  the  optimum  values  of  the  objective  function  to  the  changes 
in  the  location  and  angle  of  drop.  Figures  20  through  26  depict  the  optimal  aiming 
locations  and  angles  from  Figure  19  for  all  fourteen  scenarios: 
Figure 20: Varying Bundles - #1, 2, 3

Figure 21: Varying Standard Deviation - #1, 4, 5, 6

Figure 22: Chute Type - Altitude - #7, 8, 9, 10

Figure 23: Varying Collateral Values - #1, 11

Figure 24: Smaller Collateral Objects - #12

Figure 25: Fewer Collateral Objects - #13

Figure 26: Constrained Flying Angle - #14

• While any change in the inputs may result in a vastly different area of the
grid in which to drop, in the examples chosen we see that fewer bundles (#2),
smaller bundle separation (#3), or smaller collateral objects (#12) did not
greatly move the aimpoint from the standard case (#1).

•  Changing  the  standard  deviations  (#4,  5,  6)  not  only  has  a  tremendous  effect 
on  expected  collateral  damage,  but  can  drastically  change  the  location  of  the 
optimal  solution. 

• Changing the values of the collateral objects (#11) has a major effect on the
optimal location. This is true even if the average collateral object value is the
same. Both the standard scenario (#1) and the varying value scenario (#11)
have an average collateral object value of 1, yet #11 has a 34% lower objective
function value.

• Decreasing the size of the collateral objects (#12) yields little movement in the
optimal aiming location from the standard case (#1), but over a 90% decrease
in the expected amount of collateral damage.

• Fewer collateral objects (#13) result in 50% less of the scene being covered by
collateral objects, while smaller collateral objects (#12) result in 89% less coverage
of the scene. Nevertheless, the scenario with fewer collateral objects yields less
damage, indicating that it is better to have more collateral object area that is
concentrated rather than less collateral area that is spread out.

•  The  constrained  angle  scenario  (#14)  demonstrates  the  potential  risk  when 
airdrop  flight  paths  are  restricted  by  weather,  terrain,  or  other  safety/logistics 
concerns.  In  this  scenario,  the  flight  paths  are  constrained  to  be  within  ten 
degrees  of  due  north.  From  the  standard  scenario,  we  see  a  29%  increase  in 
the  expected  amount  of  collateral  damage. 
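
The percentage changes quoted above can be re-derived directly from the last column of Figure 19. The following is a minimal sketch of that arithmetic; the dictionary and function names are illustrative only, and only scenarios whose damage values appear in the table are included.

```python
# Damage values at optimum, copied from the last column of Figure 19
# (scenario number -> value). Names here are illustrative, not from the source.
damage = {1: 0.206, 3: 0.154, 5: 0.015, 11: 0.136, 12: 0.017, 14: 0.265}


def percent_change(value, baseline):
    """Relative change of `value` against `baseline`, in percent.

    Negative numbers mean less expected collateral damage than the baseline.
    """
    return 100.0 * (value - baseline) / baseline


# Compare every listed scenario against the standard case (#1).
changes = {
    scenario: round(percent_change(value, damage[1]))
    for scenario, value in damage.items()
    if scenario != 1
}

for scenario, change in sorted(changes.items()):
    print(f"scenario #{scenario}: {change:+d}% relative to #1")
```

Rounded to whole percentages, these reproduce the figures cited in the text for scenarios #3, #5, #11, #12, and #14.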

2.5  Results  and  Discussion 

In this paper, we present a characterization of the distribution of supply airdrops
and methods for optimally dropping them. Specifically, supply airdrops follow
a bivariate normal distribution in which the x and y deviations are uncorrelated
($\rho = 0$). A surrogate approximation function for the bivariate normal distribution
supports quick integration of the distribution to assess drop risk. RSM with surrogate
and DE both return Pareto-optimal results, representing different tradeoffs between
runtime and accuracy. Both find near-optimal solutions of the non-linear program
resulting from the airdrop problem, quickly finding settings for both an airdrop
location and an approach angle. Enumeration is strongly dominated by all other
algorithms.

It is natural to ask whether an expert eye is a substitute for algorithms. It
turns out that it is not. Suppose an airdrop planner who has been shown the oval
shapes and scales of a bundled supply airdrop could predict the optimal aimpoint to
within 50 meters in each direction and the drop angle to within 5 degrees of the optimal
solution. Note that this is a high standard - we have looked at hundreds of combinations
of drops, yet still only approach that level of accuracy. In our base problem,
where the collateral objects have the same weighting, the planner “eyeballing” a
solution would have a collateral risk 14% higher than the optimum. In more complicated
scenarios where the collateral objects are weighted differently, “eyeballing”
a solution becomes much worse than the solutions found by our algorithms, with
“eyeballed” solutions routinely worse by 20% or more. A more reliable technique
must be implemented to limit damage and ensure recoverability.

In  terms  of  future  work,  we  have  two  ideas.  First,  rather  than  using  bounds  on 
the  x  and  y  directions  to  limit  the  bundle  drop  zone,  an  attractor  function  could  be 
incorporated,  which  would  approximate  the  likelihood  of  recovery  as  a  distance  from 
a  given  point  (typically  the  middle  of  the  scene).  The  attractor  function  would  be 
weighted  and  added  to  the  objective  function  for  the  problem.  Second,  automating 
chute  selection  when  not  all  missions  would  be  able  to  use  the  most  accurate  types 
of  chutes  has  appeal.  For  example,  in  the  case  where  an  inventory  of  flights/chutes 
is  available  to  cover  a  set  of  drop  zones,  we  would  optimize  not  just  the  individual 
drop  but  the  portfolio  of  drops  determining  which  chutes  should  be  used  for  which 
mission  to  minimize  the  overall  risk  of  collateral  damage. 


III. Pareto-Optimality for Lethality and Collateral Risk in the
Airstrike Multi-Objective Problem

3.1 Introduction

Sources  estimate  that  at  least  6,000  to  9,000  civilian  casualties  [38]  [98]  [45] 
have  occurred  in  Afghanistan  since  the  beginning  of  the  Global  War  on  Terrorism 
(GWOT)  as  a  direct  result  of  Coalition  military  actions.  More  specific  to  the  United 
States Air Force (USAF), over 1,000 civilian deaths have occurred since the inception
of GWOT due to air strikes [38]. In addition, property damage resulting from
airstrikes in Afghanistan to civilian-owned buildings has alienated some local residents
and ruined goodwill created between NATO and anti-Taliban citizens [89].

The  Chairman  of  the  Joint  Chiefs  of  Staff  Instruction  on  No-Strike  and  the 
Collateral  Damage  Estimation  (CDE)  Methodology  from  2009  [25]  gives  the  U.S. 
Military  its  guidance  on  the  subject  of  collateral  damage.  The  document  lays  out 
such  things  as  which  types  of  buildings/structures  are  typically  parts  of  a  no-strike 
list  and  under  which  circumstances  a  commander  may  fire  upon  buildings  known  to 
contain collateral objects or people. The document touches on the use of human
shields by the adversary, special restrictions on targets which may cause grave
environmental or biological concerns, and the roles of personnel within the targeting
process.

Of primary importance, the Instruction provides the collateral damage methodology
(CDM) process, which seeks to be “simple and repeatable” in order to provide
“a reasonable determination of collateral damage inherent in weapons employment.”
CDM is presented in five levels of increasing risk of collateral damage. A target
progresses from level 1 until further progression is no longer necessary, so that
the target (and associated collateral risk) is categorized as either CDE Level
1-Low, 2-Low, 3-Low, 4-Low, 5-Low or 5-High. Within CDM, weapons (and their
method of employment) have assigned collateral effects radii (CER), which give
“a radius representing the largest collateral hazard distance for a given warhead,
weapon, or weapon class considering predetermined, acceptable collateral damage
thresholds that are established for each CDE level.” From the CER, a collateral
hazard area (CHA) is typically created by rotating this radius around the aimpoint
of a weapon to create a circle. Collateral objects falling within that CHA for a
given CDE level cause the target to be elevated into the next higher CDE level for
further evaluation, until finally a CHA is created with no collateral objects within
its boundaries, or the rating of CDE level 5-High is given to the target.

The  CERs  for  given  weapons  and  methods  of  employment  only  spell  out  the 
radius  outside  of  which  a  collateral  object  should  be  safe  from  the  weapon’s  firepower. 
This approach to the estimation of a weapon’s power is known as the “cookie-cutter”
approach, whereby all objects falling within the radius are considered to be destroyed
and all objects falling outside of the radius are considered 100% safe. While simple and easy
to  implement,  this  assumption  can  be  detrimental  in  the  planning  process  for  a 
weapons  strike.  The  distribution,  and  not  just  the  lethal  range,  of  the  weapon  has  a 
significant  effect  on  the  choice  of  an  optimal  aiming  location  and  weapon  selection. 
This  paper  seeks  to  quantify  the  effects  of  different  weapon  damage  functions  along 
with  the  effect  of  improvements  made  to  the  current  policy  guiding  collateral  damage 
mitigation. 

3.2  Background 

3.2.1 Collateral Damage Estimation. In the literature the probability of
destroying a point target is calculated with the following formula [64] [82]:

$$P = \int_X \int_Y p(x,y) \, d(x,y) \, dy \, dx \tag{10}$$

where

$P$ - probability that the point target is destroyed,

$p(x,y)$ - probability density function of the weapon’s impact point,

$d(x,y)$ - probability that the point target is destroyed given that the weapon impacts
at point $(x,y)$.


Equation 10 indicates that lethality at a particular point $(x,y)$ in space
depends on both the uncertainty of the landing location ($p(x,y)$) and the damage
caused at a given point when the weapon lands at a particular point in space ($d(x,y)$).
Therefore, collateral damage estimations can be made by knowing the location error
function and the damage function for the particular weapon.

The most commonly used location error formula for an air-to-ground weapon
is [64]:

$$p(x,y) = \frac{1}{2\pi\sigma^2} \, e^{-\frac{1}{2\sigma^2}\left((x-\mu_x)^2 + (y-\mu_y)^2\right)} \tag{11}$$

where

$\mu_x$ - x-coordinate of aimpoint,

$\mu_y$ - y-coordinate of aimpoint,

$\sigma$ - standard deviation of the miss distance for the weapon.


This formula is the bivariate normal distribution, where the x and y miss
distances are uncorrelated ($\rho = 0$) and the distributions in both the x and y
directions are identical (that is, $\sigma = \sigma_x = \sigma_y$) [29]. To account for situations where
the miss distances in the x and y directions are different ($\sigma_x \ne \sigma_y$), yet uncorrelated,
one can use [34]:

$$p(x,y) = \frac{1}{2\pi\sigma_x\sigma_y} \, e^{-\left[\frac{(x-\mu_x)^2}{2\sigma_x^2} + \frac{(y-\mu_y)^2}{2\sigma_y^2}\right]} \tag{12}$$


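For correlated miss distances in the x and y directions ($\rho \ne 0$) we must use
a more complicated formula [32]:

$$p(x,y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(x-\mu_x)^2}{\sigma_x^2} - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} + \frac{(y-\mu_y)^2}{\sigma_y^2}\right]\right\} \tag{13}$$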
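
The three location error formulas are nested: the correlated form reduces to the unequal-variance form when the correlation is zero, and further to the circular form when the two standard deviations are equal. A minimal sketch of this, using the standard bivariate normal density and illustrative function names (not taken from the source), is:

```python
import math


def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho=0.0):
    """Density of the impact point for aimpoint (mu_x, mu_y).

    Standard bivariate normal density with correlation rho; with rho = 0 it is
    the unequal-variance form, and with sigma_x == sigma_y as well it is the
    circular form.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    q = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (1.0 - rho * rho)
    norm = 2.0 * math.pi * sigma_x * sigma_y * math.sqrt(1.0 - rho * rho)
    return math.exp(-0.5 * q) / norm


def circular_normal_pdf(x, y, mu_x, mu_y, sigma):
    """Circular (equal-variance, uncorrelated) special case."""
    r2 = (x - mu_x) ** 2 + (y - mu_y) ** 2
    return math.exp(-r2 / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)


# With rho = 0 and equal standard deviations the two functions agree.
general = bivariate_normal_pdf(1.0, 2.0, 0.5, -0.5, 2.0, 2.0)
circular = circular_normal_pdf(1.0, 2.0, 0.5, -0.5, 2.0)
print(abs(general - circular) < 1e-12)
```

Writing the general case once and treating the simpler formulas as special cases avoids maintaining three separate implementations.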
A variety of damage functions are used for a variety of reasons, such as simplicity,
accuracy in modeling the data, and ease of computation. Additionally, different
types of air-to-ground weapons will have differing patterns of damage. The uniting
characteristics of these functions are that they decrease as the radius
(distance from point of impact) increases, their integral $\int_0^\infty d(r)\,dr$ is bounded, and
they are “well-behaved” [64], meaning that either there exists a radius $R$ such that
$d(r) = 0$ for all $r > R$, or the function is continuous and monotonic.


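Cookie-Cutter:

$$d_1(r) = \begin{cases} 1, & r \le LR \\ 0, & r > LR \end{cases} \tag{14}$$

Gaussian:

$$d_2(r) = e^{-r^2 / 2b_1^2} \tag{15}$$

Exponential:

$$d_3(r) = e^{-r/b_2} \tag{16}$$

Lognormal:

$$d_4(r) = 0.5\left\{1 - \operatorname{erf}\left[\frac{\ln(r/\alpha)}{\sqrt{2}\,\beta}\right]\right\} \tag{17}$$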

The continuous damage functions come from 1 - CDF of the corresponding probability
distributions. For example, the exponential distribution has a PDF of

$$f(r; b_2) = \begin{cases} \frac{1}{b_2} e^{-r/b_2}, & r \ge 0 \\ 0, & r < 0 \end{cases}$$

and a CDF of

$$F(r; b_2) = \begin{cases} 1 - e^{-r/b_2}, & r \ge 0 \\ 0, & r < 0 \end{cases}$$

which in turn creates the damage function $d_3(r) = 1 - (1 - e^{-r/b_2}) = e^{-r/b_2}$.
Typically, the lethal range of a weapon is calculated as $\int_0^\infty d(r)\,dr$ [64]. Thus, in
order to get a realistic comparison between damage functions, we must ensure that
the lethal ranges using each of the damage functions are the same, which makes it
necessary to adjust the constants in the functions. For the exponential damage
function, $b_2$ is exactly equal to the lethal range of the weapon since
$\int_0^\infty d_3(r)\,dr = b_2$. In the Gaussian damage formula the value for $b_1$ can be
shown to equal $LR \times \sqrt{2/\pi}$. As noted in [82], since the lognormal
damage formula has two inputs ($\alpha$, $\beta$) there is no unique setting for the inputs to
give a certain lethal range. For example, the settings $\alpha = 0.615$ and $\beta = 1$ for the
lognormal damage function can be read from the graph in Figure 27 for $LR = 1$. Each
of the damage functions with a lethal range of 1 is depicted in Figure 28.
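
The calibration described above can be checked numerically. The sketch below implements the four damage functions and integrates each from 0 to a large cutoff with a midpoint rule; the function names, cutoff, and step count are assumptions made for illustration. The helper `lognormal_alpha` relies on a standard fact not stated in the text: the integral of the lognormal survival function equals the lognormal mean, $\alpha e^{\beta^2/2}$, so fixing $\beta$ determines $\alpha$ for a desired lethal range.

```python
import math


def cookie_cutter(r, lethal_range):
    return 1.0 if r <= lethal_range else 0.0


def gaussian(r, lethal_range):
    b1 = lethal_range * math.sqrt(2.0 / math.pi)  # calibration stated in the text
    return math.exp(-r * r / (2.0 * b1 * b1))


def exponential(r, lethal_range):
    b2 = lethal_range  # b2 equals the lethal range
    return math.exp(-r / b2)


def lognormal(r, alpha, beta):
    if r <= 0.0:
        return 1.0
    return 0.5 * (1.0 - math.erf(math.log(r / alpha) / (math.sqrt(2.0) * beta)))


def lognormal_alpha(lethal_range, beta):
    """Assumed helper: alpha giving the desired lethal range for a chosen beta."""
    return lethal_range * math.exp(-beta * beta / 2.0)


def lethal_range_of(damage, upper=100.0, steps=100_000):
    """Midpoint-rule approximation of the integral of d(r) from 0 to `upper`."""
    h = upper / steps
    return h * sum(damage((i + 0.5) * h) for i in range(steps))


for name, fn in [
    ("cookie-cutter", lambda r: cookie_cutter(r, 1.0)),
    ("Gaussian", lambda r: gaussian(r, 1.0)),
    ("exponential", lambda r: exponential(r, 1.0)),
    ("lognormal (0.615, 1)", lambda r: lognormal(r, 0.615, 1.0)),
]:
    print(f"{name}: lethal range ~ {lethal_range_of(fn):.3f}")
```

Under these assumptions the setting $\alpha = 0.615$, $\beta = 1$ gives a lethal range slightly above 1, which is consistent with a value read from a graph rather than computed exactly.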


Figure  27:  Lognormal  Damage  Function  Inputs  for  Desired  Lethal  Range 


Figure  28:  Damage  Functions  (LR=1) 
3.2.2 Lethality Functions. Combining the damage function with the location
error function (with the assumption of $\rho = 0$ and $\sigma_x = \sigma_y$) yields the following
lethality function, which is the expected amount of damage at a given distance from
the aimpoint $(x', y')$:

$$\int_X \int_Y p(x,y,x',y') \, d(x,y) \, dy \, dx \tag{18}$$


By  converting  (x,y)  into  a  distance  r  from  (x',y'),  we  can  then  inspect  the 
shape  of  the  lethality  functions  based  on  the  different  damage  functions  d(x,y). 
In  the  case  where  the  standard  deviations  are  1  unit  and  the  lethal  range  of  the 
weapon  is  1  and  5  units,  respectively,  we  generate  the  following  graphs  of  the  lethality 
functions: 


Figure  29:  Lethality  Functions  (LR=1) 
Figure 30: Lethality Functions (LR=5)

Figures  29  and  30  give  insight  as  to  which  functions  over-  or  under-estimate 
lethality and at which ranges. For example, in both graphs the cookie-cutter approach
gives a higher result at the aimpoint (r = 0) than the other approaches.
This  is  particularly  pronounced  in  Figure  30  where  the  cookie-cutter  approach  yields 
lethality  almost  75%  higher  than  the  lognormal  and  exponential  damage  functions  at 
a  distance  from  the  aimpoint  of  3.  This  phenomenon  will  always  be  most  pronounced 
when  the  lethal  range  is  high  relative  to  the  standard  deviation  of  the  miss  distance 
of  the  weapon  (in  these  two  examples,  the  standard  deviation  of  the  miss  distance 
for  the  weapon  is  1).  In  the  extreme  case  where  the  accuracy  is  degraded  (yielding  a 
high  standard  deviation)  then  the  cookie-cutter  approach  will  yield  a  smaller  value 
at  the  aimpoint  than  the  other  approaches. 
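
The lethality values discussed above can be approximated by integrating Equation 18 on a grid. The following is a minimal sketch, assuming a circular normal location error centred on the aimpoint and a simple midpoint rule; the function names, grid extent, and step count are illustrative choices, not the method used to produce Figures 29 and 30.

```python
import math


def cookie_cutter(lethal_range):
    return lambda r: 1.0 if r <= lethal_range else 0.0


def gaussian(lethal_range):
    b1 = lethal_range * math.sqrt(2.0 / math.pi)
    return lambda r: math.exp(-r * r / (2.0 * b1 * b1))


def lethality(dist, sigma, damage, half_width=6.0, steps=300):
    """Midpoint-rule estimate of Equation 18 at a point `dist` from the aimpoint.

    The aimpoint is placed at the origin and the point of interest at (dist, 0).
    Impact points are integrated over a square of +/- half_width * sigma.
    """
    span = half_width * sigma
    h = 2.0 * span / steps
    total = 0.0
    for i in range(steps):
        x = -span + (i + 0.5) * h
        for j in range(steps):
            y = -span + (j + 0.5) * h
            p = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
            total += p * damage(math.hypot(x - dist, y))
    return total * h * h


def gaussian_lethality_closed_form(dist, sigma, lethal_range):
    """Closed form for the Gaussian damage function (Gaussian convolution identity)."""
    b_sq = lethal_range ** 2 * 2.0 / math.pi
    s = sigma ** 2 + b_sq
    return (b_sq / s) * math.exp(-dist ** 2 / (2.0 * s))


for lr in (1.0, 5.0):
    cc = lethality(0.0, 1.0, cookie_cutter(lr))
    ga = lethality(0.0, 1.0, gaussian(lr))
    print(f"LR={lr}: cookie-cutter {cc:.3f}, Gaussian {ga:.3f} at the aimpoint")
```

For the cookie-cutter function at the aimpoint, the integral is the probability that a Rayleigh-distributed miss distance falls within the lethal range, $1 - e^{-LR^2/2\sigma^2}$, which gives an independent check on the grid estimate.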

Lucas [64] goes into detail about the limiting behavior of each of the damage
functions, noting that the lethality of the cookie-cutter function drops off the fastest
at large distances from the aimpoint. This fact could be surmised from the
very quick drop of the cookie-cutter approach in Figure 29 and particularly in
Figure 30. Further, Lucas [64] proves that the lognormal damage function has higher
lethality in its tail as r goes to infinity than any of the other damage functions,
followed by the exponential, then the Gaussian, and finally the cookie-cutter approach,
which has the lowest lethality in its tail. These insights about the limiting behavior
of the lethality functions hold irrespective of the accuracy of the weapon.

3.2.3 Offset Aiming. The concept of offset aiming is integral to the discussion
of the minimization of collateral damage. Offset aiming is the concept that
directly targeting military objects is not always optimal. For example, if a military
target is located directly west of a collateral object (say, a school) and
striking the military target directly with the chosen weapon would carry
enough force to significantly damage the school, then perhaps aiming slightly to the
west of the military object would be optimal. The weapon and its lethality function
might indicate that aiming 10 meters to the west of the military object would still
deliver enough firepower to accomplish the military objectives while the extra ten
meters would put the school outside of the lethal range of the weapon, thus lowering
any negative effects on the school.
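
The school example can be made concrete with a small calculation. The sketch below is illustrative only: the weapon parameters and the 30-meter separation are hypothetical, and it assumes the Gaussian damage function with a circular normal location error, for which the lethality at distance $r$ from the aimpoint has the closed form $\frac{b_1^2}{\sigma^2 + b_1^2} e^{-r^2 / 2(\sigma^2 + b_1^2)}$.

```python
import math

# Hypothetical scenario: target at x = 0 m, school 30 m due east of the target.
SIGMA = 5.0          # assumed miss-distance standard deviation (m)
LETHAL_RANGE = 15.0  # assumed lethal range (m)
SCHOOL_X = 30.0


def gaussian_lethality(dist, sigma, lethal_range):
    """Lethality at `dist` from the aimpoint for the Gaussian damage function."""
    b_sq = lethal_range ** 2 * 2.0 / math.pi  # b1 squared
    s = sigma ** 2 + b_sq
    return (b_sq / s) * math.exp(-dist ** 2 / (2.0 * s))


def effects(aim_x):
    """Return (lethality on target, lethality on school) for an aimpoint on the x-axis."""
    on_target = gaussian_lethality(abs(aim_x - 0.0), SIGMA, LETHAL_RANGE)
    on_school = gaussian_lethality(abs(aim_x - SCHOOL_X), SIGMA, LETHAL_RANGE)
    return on_target, on_school


direct = effects(0.0)    # aim directly at the target
offset = effects(-10.0)  # aim 10 m west of the target

print(f"direct aim: target {direct[0]:.3f}, school {direct[1]:.4f}")
print(f"10 m offset: target {offset[0]:.3f}, school {offset[1]:.4f}")
```

With these assumed numbers, the offset reduces the effect on the school proportionally far more than it reduces the effect on the target, which is the tradeoff offset aiming exploits.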

Offset  aiming  is  already  part  of  the  Department  of  Defense’s  (DoD)  official 
policy  on  collateral  damage  estimation  and  mitigation.  However,  offset  aiming  is  not 
considered  until  later  levels  of  the  CDE  guidance.  There  is  an  argument  to  be  made 
that  offset  aiming  should  be  considered  at  all  levels  of  the  CDE  process. 

3.2.4 Weapons Employment as a Multi-Objective Problem. The current
DoD  policy  on  collateral  damage  indicates  that  collateral  damage  estimation  must 
be  performed  prior  to  any  pre-planned  air-to-ground  strike.  A  commander  must  be 
made  aware  of  collateral  risk  in  the  area  surrounding  the  military  target  and  be 
provided  with  detailed  collateral  damage  estimation  before  giving  orders  to  strike. 
Efforts  must  be  made  to  avoid  collateral  damage  at  a  high  cost,  and  the  commander 
then  decides  if  the  amount  of  collateral  risk  is  worth  the  military  value  of  striking 
that  target.  Within  CDE,  differing  weapons,  aimpoints  and  methods  of  employment 
are  considered  in  an  attempt  to  satisfy  military  objectives  in  the  face  of  collateral 
risk. The two concepts, military objectives and collateral risk, are played off against
each  other  in  order  to  create  a  mission  plan  that  the  commander  is  willing  to  support. 
La  Rock  [57]  discusses  a  multi-objective  approach  towards  weapon  implementation 
taking  into  consideration  collateral  effects. 

Typically, as the mission plan seeks to lower the collateral risk, the lethality on
the military target suffers. The converse is also true: as the lethality on the target
is increased, the risk to collateral objects in the area also increases.
The  lethality  functions  mentioned  in  the  previous  section  are  the  same  for  both 
collateral  objects  and  military  targets;  however,  the  goal  is  to  minimize  the  lethality 
on  collateral  objects  and  maximize  the  lethality  on  military  targets. 

3.3  Formulation 

For a given damage function $d(x,y)$ and a known delivery error function $p(x,y)$,
we may begin to characterize the multi-objective function we seek to optimize for a
given scenario.

Goal: Max $f_1(x,y)$ (lethality on the military target) (19)

Min $f_2(x,y)$ (lethality on collateral objects)

where

$$f_1(x,y) = \int_X \int_Y p(x,y,x',y') \, d(x,y) \, dy \, dx$$

$$f_2(x,y) = \sum_{i=1}^{n} c_i \int_X \int_Y p(x,y,x_i,y_i) \, d(x,y) \, dy \, dx$$

$$p(x,y,x_i,y_i) = \frac{1}{2\pi\sigma^2} \, e^{-\frac{1}{2\sigma^2}\left((x-x_i)^2 + (y-y_i)^2\right)}$$

$(x',y')$ - location of the military target

$d(x,y)$ - damage function for weapon

$n$ - number of collateral objects in the area of concern

$(x_i,y_i)$ - location of the ith collateral object

$c_i$ - value of the ith collateral object

$\sigma$ - standard deviation of weapon miss distance

Damage function specific inputs

$\alpha, \beta$ - lognormal damage function

$b_1$ - Gaussian damage function

$b_2$ - exponential damage function

$LR$ - cookie-cutter damage function

The  values  of  the  collateral  objects  are  typically  subjective  based  on  the  desire 
to  avoid  striking  them.  The  higher  the  value  the  more  concern  the  planners  have  for 
avoiding  this  structure/area.  The  inputs  to  the  lethality  function  are  the  lethal  range 
of  the  weapon,  the  damage  function  to  be  used  (along  with  choices  of  either  alpha 
or  beta  for  the  lognormal  function),  and  the  accuracy  of  the  weapon  (the  standard 
deviation  of  the  miss  distance). 

3.3.1  Goal  Programming  Formulation.  Once  offset  aiming  is  introduced 
to  achieve  collateral  damage  mitigation,  a  goal-programming  approach  can  then 
be  employed  to  get  an  accurate  comparison  between  the  damage  functions’  effect 
on  both  lethality  and  collateral  damage.  For  instance,  we  could  stipulate  that  the 
lethality  on  the  military  target  must  be  above  a  certain  number,  say  90%.  If  this  were 
our  assumption,  then  our  secondary  goal  would  be  to  then  find  the  point  that  satisfies 
this  requirement  while  trying  to  minimize  the  collateral  damage.  We  will  call  this 
approach  the  lethality  first  approach,  or  in  this  case  the  90%  lethality  first  approach. 
Conversely,  if  we  wanted  to  use  a  constraint  of  no  more  than  10%  collateral  risk, 
we  would  start  by  eliminating  all  aimpoints  that  don’t  first  satisfy  this  constraint. 
From  there,  we  would  then  search  the  space  that  maximizes  lethality  on  the  military 
target;  this  will  be  the  collateral  first  approach. 

If we can turn either of the two objective functions into a constraint, then this
problem is simply a single-objective non-linear optimization problem with an additional
constraint from the other objective. There will now be just a single solution (in our
case, a single point $(x,y)$ in the scenario) which optimizes our objective.

The new formulation for the collateral first approach becomes:

Goal: Max $f_1(x,y)$ (20)

subject to: $f_2(x,y) \le c$

For the lethality first approach the formulation becomes:

Goal: Min $f_2(x,y)$ (21)

subject to: $f_1(x,y) \ge c$

with the same constraints and definitions as in Equation 19. In cases where the miss
distance standard deviations in the x and y directions are the same, $f_1$ will
have the same value for all points which are the same distance from the location of
the military target. Therefore, $f_1(x,y)$ can be thought of as $f_1(r)$ where $r$ is the
distance from the point $(x,y)$ to the location of the military target $(x',y')$. That is,
$f_1$ is symmetric about the location $(x',y')$.
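
The two goal-programming formulations can be illustrated with a brute-force grid search. The sketch below is not one of the solution techniques evaluated in this paper; it assumes a hypothetical scene (one target, two weighted collateral objects), the Gaussian damage function, and its closed-form lethality under a circular normal location error. All names and parameter values are illustrative.

```python
import math

SIGMA, LETHAL_RANGE = 5.0, 15.0  # hypothetical weapon parameters (m)
TARGET = (0.0, 0.0)
# Hypothetical collateral objects: ((x_i, y_i), c_i)
COLLATERAL = [((30.0, 0.0), 1.0), ((0.0, 40.0), 2.0)]


def lethality(aim, point):
    """Closed-form lethality at `point` for aimpoint `aim` (Gaussian damage)."""
    b_sq = LETHAL_RANGE ** 2 * 2.0 / math.pi
    s = SIGMA ** 2 + b_sq
    d2 = (aim[0] - point[0]) ** 2 + (aim[1] - point[1]) ** 2
    return (b_sq / s) * math.exp(-d2 / (2.0 * s))


def f1(aim):
    return lethality(aim, TARGET)


def f2(aim):
    return sum(c * lethality(aim, loc) for loc, c in COLLATERAL)


def grid(half_width=40.0, step=1.0):
    n = int(round(2 * half_width / step))
    return [(-half_width + i * step, -half_width + j * step)
            for i in range(n + 1) for j in range(n + 1)]


def lethality_first(min_lethality):
    """Equation 21: minimize f2 subject to f1 >= min_lethality."""
    feasible = [a for a in grid() if f1(a) >= min_lethality]
    return min(feasible, key=f2) if feasible else None


def collateral_first(max_risk):
    """Equation 20: maximize f1 subject to f2 <= max_risk."""
    feasible = [a for a in grid() if f2(a) <= max_risk]
    return max(feasible, key=f1) if feasible else None


print("lethality first (f1 >= 0.7):", lethality_first(0.7))
print("collateral first (f2 <= 0.02):", collateral_first(0.02))
print("lethality first (f1 >= 0.99):", lethality_first(0.99))
```

Note that a lethality threshold may be infeasible for a given weapon: with these assumed parameters even a direct aim yields a lethality below 0.99, so the last call returns `None`.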

Techniques for solving non-linear programs, such as response surface
methodology and evolutionary algorithms, are logical candidates. To further characterize
our objective function, we observe that it is continuous, since each of the lethality
functions is continuous (even the cookie-cutter lethality function). Therefore,
non-linear optimization techniques which rely upon a continuous function should be
tried, while techniques which are more suitable for discontinuous functions (such as
tabu search, branch-and-bound, etc.) are less logical.

3.3.2 Weighted Sum Scalarization. In a similar vein, to convert multi-objective
optimization problems into single objective optimization problems, weights
can be given to the multiple objective functions. In this case, the sum of the weighted
function values is calculated in an effort to minimize (or maximize) the total. In our
problem, since we seek to maximize lethality ($f_1$) and minimize collateral risk ($f_2$),
opposite signs are given to the two functions:

Goal: Max $w_1 f_1(x,y) - w_2 f_2(x,y)$ (22)

subject to: $w_1 + w_2 = 1$

The weighted sum scalarization approach presents a decision analysis problem
since now we must construct weights for the value of collateral objects relative to
the value of increased lethality on the military target. Importantly,
any solution to the weighted sum scalarization approach or any solution to the
goal-programming approach will be a point on the Pareto front for the problem.
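
A minimal sketch of the weighted sum approach follows, again using a brute-force grid search over a hypothetical scene with the Gaussian damage function and its closed-form lethality. The scene, parameter values, and function names are assumptions for illustration only.

```python
import math

SIGMA, LETHAL_RANGE = 5.0, 15.0  # hypothetical weapon parameters (m)
TARGET = (0.0, 0.0)
COLLATERAL = [((30.0, 0.0), 1.0), ((0.0, 40.0), 2.0)]  # ((x_i, y_i), c_i)


def lethality(aim, point):
    """Closed-form lethality at `point` for aimpoint `aim` (Gaussian damage)."""
    b_sq = LETHAL_RANGE ** 2 * 2.0 / math.pi
    s = SIGMA ** 2 + b_sq
    d2 = (aim[0] - point[0]) ** 2 + (aim[1] - point[1]) ** 2
    return (b_sq / s) * math.exp(-d2 / (2.0 * s))


def f1(aim):
    return lethality(aim, TARGET)


def f2(aim):
    return sum(c * lethality(aim, loc) for loc, c in COLLATERAL)


def weighted_sum_aimpoint(w1, half_width=40, step=1):
    """Grid search for Equation 22 with w2 = 1 - w1."""
    w2 = 1.0 - w1
    coords = [float(v) for v in range(-half_width, half_width + 1, step)]
    return max(((x, y) for x in coords for y in coords),
               key=lambda a: w1 * f1(a) - w2 * f2(a))


for w1 in (1.0, 0.9, 0.5):
    aim = weighted_sum_aimpoint(w1)
    print(f"w1={w1}: aimpoint {aim}, f1={f1(aim):.3f}, f2={f2(aim):.4f}")
```

With all weight on lethality the search returns the target location itself; as weight shifts toward collateral risk, the aimpoint moves away from the collateral objects, tracing out points on the tradeoff curve.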

3.3.3 Multi-Objective Formulation. If we choose not to employ either
goal-programming or weighted sum scalarization as a technique to combine the two
objective functions into a single objective function (or a single objective function
with an added constraint), then we can use a multi-objective optimization approach.
When there is more than one objective function, there are (often) infinitely many
solutions that lie on the Pareto front for the particular problem. Assuming that all
objective functions are to be minimized, the Pareto front is the set of
solutions $(x', y')$ such that there exists no other solution $(x, y)$ for which
$f_k(x, y) \le f_k(x', y')$ for all $k$ from 1 to the number of objective functions and
$f_k(x, y) < f_k(x', y')$ for at least one value of $k$. The points in the Pareto front
are not dominated by any other solution.
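The dominance definition above can be sketched as a brute-force filter over a finite set of objective vectors. This is an illustrative quadratic-time check, not one of the algorithms compared in this chapter, and the sample points are hypothetical. Because lethality is maximized in the airstrike problem, it is negated so that both objectives follow the all-minimize convention.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return all(ai <= bi for ai, bi in zip(a, b)) and any(
        ai < bi for ai, bi in zip(a, b)
    )


def pareto_front(points):
    """Return the non-dominated objective vectors, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors (-f1, f2): negated lethality, collateral risk.
points = [(-0.9, 2.0), (-0.7, 1.4), (-0.7, 1.6), (-0.5, 1.0), (-0.4, 1.2)]
front = pareto_front(points)
```

Here `(-0.7, 1.6)` is removed because `(-0.7, 1.4)` has equal lethality and lower collateral risk, and `(-0.4, 1.2)` is removed because `(-0.5, 1.0)` is better in both objectives.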

Algorithms that identify the entire Pareto front are difficult to find,
especially for problems where the objective functions are complex, such as in the
airstrike collateral damage minimization problem. The majority of the literature on
solving multi-objective optimization problems depends on evolutionary algorithms
[19]. For instance, differential evolution approaches [2] [107] [104] [9] use mutation
and combination methods developed for a single-objective algorithm by Storn and
Price [94]. We will compare the differential evolution algorithm of [2] against the
algorithm we created to solve our multi-objective problem in the following section.

3.4 Methodology

In this section we compare a differential evolution algorithm to a
radius-based search method that exploits the structure of the airstrike multi-objective
formulation. The radius-based search method is shown to run in a fraction of the
time of a differential evolution algorithm and to produce better results. The
radius-based search algorithm relies on the fact that, with only a single military target in
the region of interest, we can express the lethality function $f_1$ in terms of only the
distance from the target. Thus, any point that is Pareto optimal must have the
lowest collateral risk among all points at the same distance from the target. The converse
is not true; that is, a point which has the lowest collateral risk among all points at the same
distance from the target is not guaranteed to be Pareto optimal.

Further, while it is not guaranteed that all Pareto optimal solutions lie on a
continuous line, in practice we find that this is the case in nearly all scenarios. Since
the lethality functions are guaranteed to be strictly decreasing (regardless
of their underlying damage functions), the location of the target is a Pareto
optimal solution. Therefore, the continuous line emanates from the target location and
extends to the edge of the scene. This is consistent with the graphs of the Pareto
optimal solutions in the figures earlier in this section.

Figure 31 shows our radius-based solution algorithm, which accurately estimates
the Pareto front and Pareto optimal solutions.

minradius = min distance from target location to scene edge
optimal_angle(0) = 0

% determine optimal angle for every radius
for minradius_percentage = 1 to 100
    current_angle = optimal_angle(minradius_percentage - 1)
    do until coll_risk(current_angle, minradius_percentage) < min(coll_risk(current_angle - 1, minradius_percentage), coll_risk(current_angle + 1, minradius_percentage))
        current_angle = current_angle ± 1
    loop
    optimal_angle(minradius_percentage) = current_angle
next minradius_percentage

% eliminate dominated solutions
for minradius_percentage = 100 to 1
    if coll_risk(optimal_angle, minradius_percentage) > coll_risk(optimal_angle, minradius_percentage - 1)
        optimal_angle(minradius_percentage) = BLANK
    end
next minradius_percentage

nondominatedsolutions = 0
for minradius_percentage = 0 to 100
    if optimal_angle(minradius_percentage) <> BLANK
        nondominatedsolutions = nondominatedsolutions + 1
        paretosolutions(nondominatedsolutions, 1) = minradius_percentage
        paretosolutions(nondominatedsolutions, 2) = optimal_angle(minradius_percentage)
    end
next minradius_percentage


Figure  31:  Radius-Based  Search  Algorithm 
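The pseudocode in Figure 31 can be sketched in Python as follows. This is an illustrative reading under stated assumptions, not the implementation used for the reported timings. The function and parameter names are hypothetical, the scene is assumed to be an axis-aligned rectangle, and `coll_risk(x, y)` is assumed to return the collateral risk at an aimpoint. Three details differ from the figure: the angle walk stops on `<=` rather than `<` so that flat regions terminate, the walk is capped at 360 steps, and dominated points are removed by comparing against the lowest risk found at any smaller radius rather than only the adjacent radius.

```python
import math


def radius_based_search(coll_risk, target, scene=(0.0, 0.0, 100.0, 100.0), steps=100):
    """Sketch of the radius-based Pareto search for a single target.

    coll_risk(x, y) -> collateral risk f2 at aimpoint (x, y).
    Returns a list of (radius, angle_deg, f2) for the retained aimpoints,
    ordered by increasing radius (that is, decreasing lethality).
    """
    tx, ty = target
    xmin, ymin, xmax, ymax = scene
    min_radius = min(tx - xmin, xmax - tx, ty - ymin, ymax - ty)

    def risk(angle_deg, radius):
        a = math.radians(angle_deg)
        return coll_risk(tx + radius * math.cos(a), ty + radius * math.sin(a))

    # Step 1: for each radius, walk the angle downhill from the previous optimum.
    best = [(0.0, 0, coll_risk(tx, ty))]  # radius 0 is the target itself
    angle = 0
    for k in range(1, steps + 1):
        radius = min_radius * k / steps
        for _ in range(360):  # cap so the walk always terminates
            here = risk(angle, radius)
            left, right = risk(angle - 1, radius), risk(angle + 1, radius)
            if here <= min(left, right):
                break
            angle += -1 if left < right else 1
        best.append((radius, angle % 360, risk(angle, radius)))

    # Step 2: keep a point only if its risk is below every closer retained point.
    front, lowest = [], math.inf
    for radius, ang, f2 in best:
        if f2 < lowest:
            front.append((radius, ang, f2))
            lowest = f2
    return front
```

Because lethality depends only on the radius, each retained triple can be paired with its lethality value afterwards without any further search.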


3.4.1 Prototype Problem. Using the formulation laid out in the previous
section, we may now begin to picture what the objective functions look like over
the range of possible solutions. For the examples in this section, we
assume that the scene is a 100 meter by 100 meter square. There is a single
military target in each scene along with a number of equally weighted collateral
objects that we seek to avoid damaging. With a miss distance standard deviation
and a lethal range provided for the weapon used, contour plots can be developed for
each of the damage functions. From these contour plots, it is easy to identify
where both the military targets and collateral objects are located within the scene
(each of the contour plots is based on the same scenario). Figures 32 and 33 show
the lethality function and the collateral objects for our prototype problem.

Figure 32: Lethality Function

Figure 33: Location of Collateral Objects


To help visualize the nature of solutions, Figures 34 - 37 plot the two objective
functions for the prototype problem using the four different damage functions (the
lethality function is on the left and the collateral risk function is on the right).
The cookie-cutter damage function (Figure 36) shows the fastest drop-off of any
of the damage functions, yielding more distinct humps in the collateral risk graph
surrounding the twenty collateral objects. Contrast this with the exponential damage
function collateral risk graph (Figure 35), where the humps surrounding the collateral
objects are much more blurred as a result of a more gradual decline in the lethality
function for the exponential. The lognormal and Gaussian damage function graphs
fall between the two extremes of the exponential and cookie-cutter graphs.


Figure  34:  Gaussian  Damage  Function 


Figure 35: Exponential Damage Function

Figure 36: Cookie-Cutter Damage Function


Figure  37:  Lognormal  Damage  Function 

Using enumeration (finding the lethality and collateral risk at a very fine
resolution, 1000x1000, in the solution space), we find the non-dominated points. Figures 38
- 41 show the objective function values for all points in the scene (sampled every
meter in the x- and y-directions) in the leftmost graphs, with the higher lethality on the
left edge and the lower collateral risk on the lower edge. The middle graphs show the
non-dominated (Pareto-optimal) points in the objective space, which are those points
on the “southwest” edge of the left plots. These points are non-dominated since no
other point in this space has both a higher lethality value and a lower collateral
value. The right graphs show the location of the non-dominated points
in the solution space.

Figure 38: Gaussian Damage Function


Figure  39:  Exponential  Damage  Function 


Figure 41: Lognormal Damage Function

It is not surprising that the location of the military target is
among the Pareto optimal points, since no other point has a higher
lethality value (due to the decreasing nature of the functions in
Figures 29 and 30). Keeping in mind that these functions were all tested with the
same lethal range, accuracy and scenario, the cookie-cutter function estimates much
higher lethality on the target and much lower collateral risk than the other
damage functions. The Gaussian function (Figure 38) demonstrates significantly higher
lethality on the target than the exponential and lognormal damage functions, but
the collateral risk among these three damage functions is fairly comparable.

In the next section, we introduce comparison metrics and compare the algorithms
discussed earlier. The goal is to quickly and accurately
locate the Pareto optimal solutions.

3.4.2 Algorithm Performance. Zitzler et al. [106] present methods for
judging the effectiveness of algorithms for finding the Pareto front in multi-objective
optimization problems. The first metric (accuracy) measures the average, over the
found solutions, of the minimum distance to a point on the true Pareto front (lower
is better). The set of solutions found in the objective space is $Y'$ and the true
Pareto optimal frontier in the objective space is $Y$:

$$M_1(Y') := \frac{1}{|Y'|} \sum_{a' \in Y'} \min\{\|a' - a\| ;\; a \in Y\}.$$ (23)

The second metric (diversity) measures the number of solutions found within a
distance of $\sigma$ from each found solution. This indicates how distinct found solutions
are from one another (lower is better):

$$M_2(Y') := \frac{1}{|Y'| - 1} \sum_{a' \in Y'} \left|\{b' \in Y' ;\; \|a' - b'\| < \sigma\}\right|,$$ (24)

The final metric (breadth) gives the maximum distance between found solutions
for each coordinate. As noted in [18], for a two dimensional problem such as
the collateral risk problem, this equals the distance of the two outer solutions (higher
is better):

$$M_3(Y') := \sqrt{\sum_{i=1}^{m} \max\{\|a'_i - b'_i\| ;\; a', b' \in Y'\}}.$$ (25)
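The three metrics can be sketched directly from Equations 23 to 25. The function names are hypothetical, points are assumed to be tuples of objective values, and Euclidean distance is assumed for the norm. The diversity count follows Equation 24 literally, so each point is counted as lying within $\sigma$ of itself.

```python
import math


def m1_accuracy(found, true_front):
    """Average distance from each found point to its nearest true-front point."""
    return sum(min(math.dist(a, t) for t in true_front) for a in found) / len(found)


def m2_diversity(found, sigma):
    """Sum over found points of the number of found points closer than sigma,
    divided by |Y'| - 1. Each point counts itself, as in Equation 24."""
    if len(found) < 2:
        raise ValueError("need at least two found points")
    close = sum(1 for a in found for b in found if math.dist(a, b) < sigma)
    return close / (len(found) - 1)


def m3_breadth(found):
    """Square root of the summed per-objective ranges of the found points."""
    dims = range(len(found[0]))
    return math.sqrt(sum(
        max(p[i] for p in found) - min(p[i] for p in found) for i in dims
    ))


# Hypothetical objective-space points for illustration.
found = [(0.0, 0.0), (3.0, 4.0)]
true_front = [(0.0, 0.0), (3.0, 0.0)]
accuracy = m1_accuracy(found, true_front)   # (0 + 4) / 2
diversity = m2_diversity(found, sigma=1.0)  # each point sees only itself
breadth = m3_breadth(found)                 # sqrt(3 + 4)
```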


Using these metrics, we compare the radius-based algorithm to enumeration
and a differential evolution approach; the results are detailed in Figure 42.

Figure 42: Algorithm Performance (Prototype Problem)

Figure 42 compares the performance of our radius-based search, a differential
evolution algorithm, and enumeration (best results in bold). The radius-based
algorithm takes 37 minutes to run on a 2.60GHz machine, whereas the differential
evolution approach takes almost four hours and the enumeration approach takes
more than a week to compute. Our algorithm demonstrates the ability to generate
points close to the true Pareto front for the prototype problem for all four damage
functions, as seen by the low values for metric 1. Metric 2, which measures the number
of points in the objective space within 0.001 of each other, yields mixed results, with
our algorithm generating the lowest totals for two of the four damage functions.
Further, small changes to the algorithm, such as using a logarithmic growth of the
percentage value rather than linear growth, improve results for radius-based search
for this metric. Metric 3 shows comparable results for the algorithms; the results are
routinely within 5% of each other for a given damage function.

Figure 43 summarizes the metrics for 100 trials of our algorithm,
demonstrating its performance against a variety of scenarios (accuracy ranging
from 1 to 5 meters, lethal range varying from 5 to 20 meters, the number of collateral
objects between 20 and 30, and locations of the collateral objects and military targets
varying within the 100m x 100m scene). These results confirm our algorithm’s
performance as seen in the prototype problem.

Figure 43: Metrics across 100 Radius-Based Trials

Now that we are able to quickly and accurately find a broad range of Pareto
optimal solutions for the true multi-objective formulation, finding solutions to both
the goal programming and weighted sum scalarization formulations becomes
straightforward. Recall that solutions to weighted sum and goal programming are a subset
of the Pareto optimal solutions. Therefore, we need to search only these solutions in
order to find optimal solutions to the other formulations. The radius-based search
gives us the location $(x, y)$ and objective function values $(f_1, f_2)$ for the optimal
solutions; therefore, finding an optimal aimpoint is as simple as
searching a small group of high-quality solutions for the best values.

As long as the weights of the collateral objects are set to 1, the formulation
becomes:

Goal: Max $w_1 f_1(x, y) - w_2 f_2(x, y)$ (26)

For example, using the Gaussian damage function, if we choose to value
lethality ($f_1$) three times as much as collateral risk ($f_2$), then $w_1 = 3/4$ and
$w_2 = 1/4$. Searching our Pareto optimal solutions yields a best solution of 0.1658
($f_1 = 0.69$, $f_2 = 1.41$) located at (29.4, 72.6). This solution is found without
having to reevaluate the objective function. In a similar manner, assume we used a goal programming
approach where we seek the lowest collateral risk while having at least 50% lethality
on the military target for the same scene. We only need to find the Pareto optimal
point with the lowest lethality above 0.50, and that is our solution to this
goal-programming problem. This point is located at (27.7, 77.6) with a collateral risk of
1.04 using the Gaussian damage function.
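As a quick check of the weighted sum example, valuing lethality three times as much as collateral risk under the constraint that the weights sum to one fixes both weights, and the objective can be evaluated from the rounded function values quoted above:

```latex
\begin{align*}
w_1 = 3 w_2,\quad w_1 + w_2 = 1 \;&\Rightarrow\; w_1 = \tfrac{3}{4},\; w_2 = \tfrac{1}{4},\\
w_1 f_1 - w_2 f_2 &= 0.75\,(0.69) - 0.25\,(1.41) = 0.5175 - 0.3525 = 0.1650 .
\end{align*}
```

The small gap between 0.1650 and the reported 0.1658 is consistent with $f_1$ and $f_2$ being quoted to only two decimal places.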

In the next section we create Pareto front solutions for our airstrike problem
using our radius-based algorithm. In particular, we explore optimal solutions based
on different scenarios, guidelines and damage functions.

3.5  Results 

We first give a visual depiction of the effects of varying the damage function
and approach on the location of the optimal solution. The scenario is the same as
the one depicted in Section 3.4.1, with 20 collateral objects in a 100m x 100m scene.
In Figure 44 we zoom in on the area around the military target (located at (30,
70)). The lethal range of the weapon is 10 meters and the accuracy is 5 meters.

Figure 44: Location of Optimal Solutions

The effect of the two closest collateral objects (at points (32.4, 65.1) and (19.6,
68.1)) can be seen in Figure 44 as offsetting the optimal locations north of the military
target for all guidelines and damage functions. As the collateral constraint increases,
it pushes the aimpoint away from the target; similarly, as the lethality constraint
increases, the optimal aimpoint moves closer to the target (in Figure 44, we
show a lethality first constraint of 50% and a collateral first constraint of 50% for
comparison).

3.5.1 Effects of Differing Damage Functions. The damage functions seen in
Figure 28 differ in shape. Using a zero offset distance, a lethal range between 5 and
10 meters, a miss distance standard deviation of 5 meters, and randomly generated
collateral objects, we see the following results for 100 trials:

Damage Function   Lethality   Collateral Damage   # with highest lethality   # with lowest collateral
Gaussian          0.5733      0.7541              0                          0
Exponential       0.4550      0.7102              0                          4
Cookie-Cutter     0.6724      0.3788              100                        96
Lognormal         0.4409      0.8427              0                          0

Figure 45: Comparison of Damage Functions


The cookie-cutter damage function, which is used in CDE, overstates lethality by
37% compared to the average of the other three damage functions, and it understates
collateral risk by 51%. In the next section, we present a more general result.
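The 37% and 51% figures can be reproduced from the averages in Figure 45. The helper name below is hypothetical; the values are copied from the figure.

```python
lethality = {"Gaussian": 0.5733, "Exponential": 0.4550,
             "Cookie-Cutter": 0.6724, "Lognormal": 0.4409}
collateral = {"Gaussian": 0.7541, "Exponential": 0.7102,
              "Cookie-Cutter": 0.3788, "Lognormal": 0.8427}


def relative_to_others(values, name):
    """Percent difference of values[name] from the mean of the other entries."""
    others = [v for k, v in values.items() if k != name]
    baseline = sum(others) / len(others)
    return 100.0 * (values[name] - baseline) / baseline


lethality_gap = relative_to_others(lethality, "Cookie-Cutter")    # about +37
collateral_gap = relative_to_others(collateral, "Cookie-Cutter")  # about -51
```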

3.5.2 Theoretical Collateral Function Values for Differing Damage Functions.

In a space where collateral objects are randomly located throughout an infinite
plane, we are able to calculate the theoretical collateral function values for the
different damage functions if we know the number of objects per unit of planar space.
By rotating the lethality function, written $d(r)$ as a function of the distance $r$,
about the vertical axis and multiplying by the number of collateral objects per
square meter ($n$), we obtain the expected collateral value for a
randomly generated scene:

$$E[f_2(r) \mid n] = 2\pi n \int_0^{\infty} r\, d(r)\, dr$$ (27)
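Equation 27 can be checked numerically for the simplest case. The sketch below assumes the standard cookie-cutter form (full damage inside the lethal range, none outside), for which the integral has the closed form $\pi n \, LR^2$; the function names and the midpoint-rule integrator are illustrative choices. With a density of twenty objects in a 100 m x 100 m scene, the closed form gives 0.628 for a 10 meter lethal range and 2.513 for a 20 meter lethal range, which match cookie-cutter entries in Figure 47.

```python
import math


def expected_collateral(damage, density, r_max, steps=200_000):
    """Midpoint-rule estimate of 2*pi*n * integral from 0 to r_max of r*d(r) dr."""
    h = r_max / steps
    total = sum((i + 0.5) * h * damage((i + 0.5) * h) for i in range(steps))
    return 2.0 * math.pi * density * total * h


def cookie_cutter(lethal_range):
    """Assumed form: full damage inside the lethal range, none outside."""
    return lambda r: 1.0 if r <= lethal_range else 0.0


n = 20 / (100 * 100)  # twenty objects in a 100 m x 100 m scene
numeric = expected_collateral(cookie_cutter(10.0), n, r_max=50.0)
closed_form = math.pi * n * 10.0 ** 2  # pi * n * LR^2
```

Truncating the integral at `r_max` is harmless here because the assumed cookie-cutter function is zero beyond the lethal range; a damage function with a long tail would need a larger cutoff.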

When looking at the theoretical value for each damage function, we can see
how the damage functions will give wildly different collateral risk values:
$E[f_{2,cc}] < E[f_{2,g}] < E[f_{2,e}]$ regardless of the accuracy and lethal range of the
weapon in a randomly generated, infinitely large scene.

That is, the expected collateral risk for the cookie-cutter damage function will
always be less than that for the Gaussian damage function, which in turn will be less
than that for the exponential damage function (proof in Appendix A). The collateral risks for varying
lethal ranges are shown in Figure 46, where the theorem is demonstrated for the
varying damage functions.


Figure  46:  Collateral  Risk  by  Damage  Function 


The values in Figure 47 summarize the collateral damage for twenty collateral
items within a 100m x 100m square along with the lethality at the aimpoint for the
various damage functions.

0.389   1.000   0.326   0.416

            g     0.985   0.718   0.800   0.389   0.800
lethal      e     0.884   0.969   1.175   0.344   1.210
            cc    1.000   0.628   0.865   0.628   0.394   0.628
            log   0.940   0.967   1.222   0.325   1.314

20 meter    g     0.996   2.933   0.911   3.178   0.718   3.180
lethal      e     0.940   3.353   0.741   3.595   0.562   3.631
            cc    1.000   2.198   1.000   2.513   0.865   2.513
            log   0.986   3.134   0.767   3.358   0.554   3.397

Figure 47: Expected Collateral Damage

These results mirror the results of [64], which indicate that the lognormal
damage function gives a higher collateral risk value (and thus a longer stand-off
range) relative to the other damage functions and, conversely, that the cookie-cutter
function gives a lower collateral risk than the other damage functions and therefore a
shorter stand-off range.

3.5.3 Effects of Differing Guidelines. In this section we test another 100
randomly generated, real-world representative problems. Current policy allows for
no offset aiming, whereas the two other guidelines allow for offset aiming while using
either collateral risk or lethality as a constraint, as seen in Equations 21 and 22. We
test a lethal range of 5 meters and an accuracy of 1 meter using the cookie-cutter damage
function. Further, we assume that the lethality must be at least 0.8 for lethality first
and collateral risk must be no more than 0.2 for collateral first (results in Figure 48).


Guideline                Lethality   Collateral Damage
Collateral First (0.2)   0.742       0.105
Lethality First (0.8)    0.823       0.205
Current Methodology      0.936       0.290

Figure 48: Results for σ = 1, LR = 1

The baseline methodology, which aims directly at the target, yields the highest
lethality. However, its collateral damage is, on average, 176% higher than for the
collateral first approach and 41% higher than for the lethality first approach, while its
lethality is 26% and 14% higher than for the collateral first and lethality first
approaches, respectively.

Figure 49: Location of Solutions

Figure 49 shows the location of solutions. The lethality first approach requires
an optimal location within four meters of the target location in order to
have at least 80% lethality on the target. This is not the case for the collateral first
approach, which can result in solutions very far from the military target in order
to find a location that falls below the 20% threshold for collateral risk (as seen in
Figure 49, where one of the optimal aimpoints is located at roughly (-25, -10), a
distance of almost 27 meters from the target).

Figure 50: Lethality First vs. Collateral First

In Figure 50, we compare the lethality first and collateral first approaches
(using the Gaussian damage function) for varying levels, with the collateral
constraint and lethality constraint ranging from 0.2 to 0.8 and differing lethal ranges
and standard deviations. In some scenarios, there are no locations which will yield
a collateral risk less than the collateral constraint or a lethality above the lethality
constraint. Thus, feasibility of solutions satisfying both goals is not assured. The trade-off also
introduces the idea of ordnance selection, the topic of the next section.

3.6  Ordnance  Selection 

Airstrike planners may have a variety of weapons as well as methods of
employment (fusing, run-in, etc.) which affect the lethal range and accuracy. Thus,
to increase lethality and decrease collateral risk, a planner must look not only for
the optimal aimpoint but also for the best selection of weapon and employment. For
instance, assume that in the same scenario as in Figure 33, the planner was presented
with the following options of weapon and employment:


Weapon     Lethal Range   Accuracy
Weapon 1   10.000         2.000
Weapon 2   8.000          1.800
Weapon 3   6.000          1.600
Weapon 4   4.000          1.400

Figure 51: Weapon/Employment Parameters


Let  us  assume  that  the  damage  function  follows  an  exponential  distribution. 
The  lethality/collateral  risk  trade-off  values  appear  in  Figure  52.  From  this  figure, 
we  see  the  smaller,  more  accurate  weapons  yield  a  slightly  lower  lethality,  but  with 
a  large  reduction  in  collateral  risk  (a  75%  reduction  in  collateral  risk  with  only  a 
10%  reduction  in  lethality  when  moving  from  Weapon  1  to  Weapon  4).  While  these 
numbers  will  vary  with  the  scenario  and  damage  function,  the  take-away  is  that 
smaller  weapons  provide  nearly  as  much  lethality  as  larger  weapons,  but  with  a 
greatly  reduced  amount  of  collateral  risk  as  long  as  the  accuracy  also  improves  with 
the  smaller  weapon. 


Figure 52: Weapon Lethality and Collateral Risk

Moreover, assume that the mission planners would like to have 90% lethality
on a military target while minimizing collateral risk. If they had only weapons 1 and
4, only weapon 1 aimed close to the target would reach this goal, and that would be
at a significant cost (collateral risk around 1.5). However, if they could fire multiple
weapons, perhaps they would choose to bring two weapon 4’s, offsetting the weapons
to achieve a lethality just over 0.7 each (since $0.3^2 \approx 0.1$, yielding a 10% likelihood
of not destroying the target). This choice would yield a combined collateral risk of
under 0.4, over a 70% decrease in collateral risk from firing weapon 1.
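The two-weapon arithmetic above can be sketched as follows, assuming the weapons destroy the target independently of one another (the same assumption implicit in squaring the 0.3 miss probability). The function names are hypothetical.

```python
def combined_lethality(p_single, shots):
    """Probability that at least one of `shots` independent weapons destroys the target."""
    return 1.0 - (1.0 - p_single) ** shots


def required_single_lethality(p_goal, shots):
    """Per-weapon lethality needed so that `shots` independent weapons reach p_goal."""
    return 1.0 - (1.0 - p_goal) ** (1.0 / shots)


two_at_070 = combined_lethality(0.70, 2)          # 1 - 0.3**2 = 0.91
needed_each = required_single_lethality(0.90, 2)  # 1 - sqrt(0.1), about 0.684
```

Under this assumption, any per-weapon lethality above roughly 0.684 is enough for two weapons to reach the 90% goal, which is why a lethality just over 0.7 each suffices.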

Since planners will often be confronted with relatively few choices in terms of
weapons and employment options, the evaluation of the collateral risk, lethality
and optimal aimpoint for each can be performed in an enumerative manner (this can
be accomplished in parallel if time is a concern). The search algorithms can be
used either when a clear goal is stated (e.g. minimize collateral risk while ensuring
90% lethality on the target) or when a variety of alternatives is available to the
decision-maker. It is, in fact, a straightforward task to include weapon selection in
our formulation if desired.

3.7  Summary  and  Conclusions 

In this paper we develop a quick algorithm for accurately creating
the Pareto optimal frontier in the multi-objective airstrike problem. This algorithm,
which leverages specific attributes of lethality and collateral risk, is shown to
routinely outperform differential evolution and enumeration algorithms. Once Pareto
optimal solutions are found, these can be quickly converted to solutions to the
associated goal-programming or weighted sum scalarization problems. The choice of
damage function is shown to greatly affect the expected lethality and collateral risk
in an airstrike, underscoring the need for accurate estimation of weapons effects.

We demonstrate that the current methodology of not using offset aiming yields
lethality 26% higher at a cost of collateral risk 176% higher than a collateral first
approach. The algorithm presented can be incorporated into the weapon (and
employment) decisions facing an airstrike planner, who could alter selections based on
the minimum lethality needed or maximum collateral risk allowed to remedy the
limitation from non-offset targeting. Future work will see the application of our
algorithm to more complicated lethality and collateral risk models such as the JWS
and JMEM tools currently used by the USAF.


IV.  Look-Look-Shoot:  Finite  and  Infinite  Horizon  Markov  Decision 
Policies  with  Limited  Intelligence 

4.1  Introduction 

In fast-moving troops-in-contact (TIC) situations, information is often subject
to the fog and friction of war. Forces cannot wait for perfect information and, as a
result, mistakes are made and civilians and even friendly forces are killed. However, the
alternative of waiting for “perfect” information before an airstrike is ordered has its
own set of consequences. Ground forces may be pinned down, and every second that
goes by increases their likelihood of being shot or killed by enemy forces. At the same
time, there is a cost to making the wrong decision, which could result in civilian casualties,
friendly force casualties and unnecessary collateral damage. The questions then
become, “How long do we wait for perfect information?” or “When have we received
enough imperfect information to make a decision?”

In  these  situations,  there  will  always  be  a  trade-off  between  the  cost  of  civilian 
casualties  and  the  cost  of  losing  friendly  forces.  There  will  be  a  cost  of  abandoning 
friendly  forces  when  they  need  help,  there  will  be  an  opportunity  cost  of  air  support 
spending  time  in  an  engagement  that  does  not  truly  threaten  them,  and  there  will 
be  a  cost  of  killing  non-combatants. 

4.1.1 Data Fusion. A goal of all observers in a conflict is to correctly
classify the nature of the suspected enemy in a timely manner (as time costs lives,
money, and the opportunity to support other engagements). In pre-planned missions,
where the nature and location of suspected enemy forces are well known, there have
been relatively few civilian casualties in recent years (only two pre-planned missions
resulted in civilian deaths from 2006 to 2007). Conversely, civilian casualties from
TIC situations have exploded in recent years (over 400 civilian deaths from TICs in
2006-07). TIC situations are defined by the lack of previous intelligence about the
suspected enemy, the location of friendly, neutral and enemy forces, the terrain, and
the capabilities of friendly and enemy forces.

In a TIC situation there may be a variety of observers attempting to determine
the true nature of the suspected hostile player. UAVs circling overhead sending
images of a building back to image analysts, ground forces in varying locations
relative to the suspected enemy, and air support pilots viewing the scene from above
will all give a unique picture of the battlespace. Each will have their own
assessment of the ground scenario, and we can assume that each of them has a
differing likelihood of being correct. In order to synthesize these perspectives, the
field of data fusion must be introduced to the problem, as we seek to get the most
correct information out of the imperfect data gathered from these sources.

Polikar  [81]  discusses  the  idea  of  ensemble  based  systems  in  decision  making, 
whereby  diverse  classifiers  making  individual  classifications  are  fused  together  to 
develop  a  cleaner  picture  of  an  unknown  event.  Polikar  gives  as  an  example  a  patient 
undergoing  tests  for  a  neurological  disorder,  who  might  undergo  an  MRI  scan,  EEG, 
blood  and  other  tests.  An  individual  test  alone  might  give  a  prediction  as  to  whether 
the  patient  has  the  disorder  or  not,  and  each  test  has  some  type  I  and  type  II  error 
involved  with  it.  The  reason  multiple  tests  are  performed  is  that  as  the  doctors  get 
varying  pictures  of  the  disorder,  they  will  make  a  more  robust  classification  yielding 
lower  type  I  and  type  II  errors.  We  seek  to  give  a  framework  for  applying  these  same 
ideas  to  TIC  classification. 
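One simple way to illustrate why fusing several imperfect observers helps is a naive Bayes update. This is only a sketch under strong assumptions, not the ensemble method of [81] nor the framework developed in this chapter: observers are assumed to be conditionally independent given the truth, and the error rates and prior below are hypothetical.

```python
def fuse_reports(prior, reports):
    """Posterior probability that the suspected target is hostile.

    prior: P(hostile) before any report.
    reports: list of (says_hostile, tpr, fpr) where
        tpr = P(observer says hostile | hostile)
        fpr = P(observer says hostile | not hostile)
    Observers are assumed conditionally independent given the truth.
    """
    like_h, like_n = prior, 1.0 - prior
    for says_hostile, tpr, fpr in reports:
        like_h *= tpr if says_hostile else 1.0 - tpr
        like_n *= fpr if says_hostile else 1.0 - fpr
    return like_h / (like_h + like_n)


# Hypothetical observers: UAV image analyst, ground unit, air support pilot.
reports = [(True, 0.80, 0.20), (True, 0.70, 0.30), (False, 0.60, 0.40)]
single = fuse_reports(prior=0.5, reports=reports[:1])
posterior = fuse_reports(prior=0.5, reports=reports)
```

With these illustrative rates, the first observer alone gives a posterior of 0.8, while all three reports together, including the dissenting one, raise it to about 0.86.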

In  the  simplest  case,  the  observers  in  a  TIC  are  trying  to  determine  if  a  building 
or  group  of  individuals  constitute  a  legitimate  military  target.  The  suspected  enemy 
forces  might  truly  have  combatants  among  them,  but  often  the  multitude  of  civilians 
(non-combatants)  among  them  who  would  be  put  in  danger  with  an  airstrike  could 
outweigh  the  gain  of  killing  the  combatants.  Recent  military  guidance  has  instructed 
US personnel to exit situations in which non-combatant lives are in danger, even if
it means disengaging with known enemy forces, if the personnel are able to safely
exit  the  environment  [39]. 

4.1.2 Decision Making with Imperfect Information. All decision making
takes place without accurate or complete information on the outcome of the decision.
Often,  with  more  time  studying  a  decision,  better  information  comes  to  light  giving 
the  decision  maker  a  better  grasp  of  the  true  nature  of  the  problem  and  the  effects 
of  a  decision  [56]  [85]  [80].  While  studying  the  time  pressure  on  decision  making, 
Payne  [79]  notes,  “In  some  cases,  the  longer  the  delay  in  making  the  decision,  the 
lower  the  expected  return  (value)  from  even  the  most  accurate  of  decisions.”  The 
fact  is  that  we  cannot  always  afford  to  make  an  accurate  decision,  when  doing  so 
delays  making  a  “satisfactory”  decision. 

Decisions are routinely made in a sequential manner. Consider the stock market,
where once an investor purchases a stock, on every following day he may choose
to sell that stock, buy more of it, or simply do nothing. Day after day, a decision
is made, and the optimal decision process is one in which the profit made on the
investment is maximized at some point in the future. The decision made on any
given day depends on all of the decisions leading up to that point (e.g., the investor
cannot sell stock on a day if he sold all of his stock the day prior).

The process of sequential decision making can be analyzed with dynamic programming
[31]. We can see the applicability of dynamic programming to TIC situations,
where after receiving incremental information, a pilot may choose to fire
upon a target, continue to loiter above it, or leave the situation and attend
to other potential targets. As in the stock example, the pilot’s decision
depends on the decisions he has previously made. A target cannot be struck if it has
been previously struck (if we assume a strike completely destroys the target), and a
pilot cannot fire upon a target if his previous decision was to leave the scene. Just as
with the investor, the pilot’s goal is to link together the chain of decisions which will
yield the best outcome at some defined future point. In order to find that decision
chain we must solve a dynamic program.

4.1.3 Dynamic Programming. Dynamic programming can be thought of
using stages and states. A stage is a discrete point in time at which a decision ($u_k$) is
made based on the state ($x_k$) of the system. The state is the summary of all decisions
from previous stages and their outcomes, due not only to the decisions made but
also the randomness ($w_k$) involved with moving from stage to stage. Some additive
reward is gained for each decision made, and the goal is to maximize the sum of
the rewards over the time horizon of the problem. Bertsekas [13] lays out the main
ingredients of a basic dynamic programming formulation as:

1. A discrete-time system of the form $x_{k+1} = f_k(x_k, u_k, w_k)$,

2. Independent random parameters,

3. A control constraint (decision),

4. An additive cost of the form $E\{g_N(x_N) + \sum_{k=0}^{N-1} g_k(x_k, u_k, w_k)\}$,

5. Optimization over policies (rules for choosing $u_k$ for each $k$ and each possible
value of $x_k$).

Denardo [31] formulates the dynamic programming problem as $x_{k+1} = f_k(x_k, u_k, w_k)$,
$k = 0, 1, \ldots, N - 1$, where:

$k$ indexes discrete time,

$x_k$ is the state of the system and summarizes past information that is relevant
for future optimization,

$u_k$ is the control or decision variable to be selected at time $k$,

$w_k$ is a random parameter,

$N$ is the number of times control is applied.
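
As a point of reference, the backward recursion usually paired with a formulation of this kind can be sketched as follows. The cost-to-go symbol $J_k$ and the admissible control set $U_k(x_k)$ are notation assumed here for illustration and do not appear in the text above; the recursion is written for costs (minimization), and a reward formulation would replace the minimum with a maximum.

```latex
J_N(x_N) = g_N(x_N),
\qquad
J_k(x_k) = \min_{u_k \in U_k(x_k)}
  E_{w_k}\left\{ g_k(x_k, u_k, w_k) + J_{k+1}\big(f_k(x_k, u_k, w_k)\big) \right\},
\quad k = N-1, \ldots, 0.
```

Under these assumptions, $J_0(x_0)$ is the optimal expected cost from the initial state, and the minimizing $u_k$ at each stage defines the policy.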

For the Look-Look-Shoot (LLS) problem, we assume that each stage brings about
more data relating to the problem, assuming the pilot continues to loiter. The rewards
in this model are typically assumed to be costs, either the cost of a “look” decision
or the likelihood of being incorrect given a “leave” or “shoot” decision.

4.1.4 Markov Decision Process. A Markov decision process (MDP) is a
problem in which there is a decision maker, a finite number of policies or choices
the decision maker can choose, a transition probability matrix which defines the
likelihood of the next state given the current state and policy, a transition reward
matrix which indicates the current reward gained for the state and policy, and a
performance metric based on the rewards gained during the stages of the MDP [42].
Its components are:

$S$, a finite state space of possible system states. A realization of the random variable
$S$ is denoted by $s$.

A,  a  finite  set  of  actions.  A  realization  of  the  random  variable  A  is  denoted  by  a. 
An  action  a  causes  transitions  from  the  current  state  to  some  new  state. 

$T : S \times A \times S \to \mathbb{R}_{[0,1]}$ is the state-transition function, giving the probability that the
agent transitions to state $s'$ when it is in state $s$ and takes action $a$. In other words, the
transitions specify how each of the actions and exogenous events change the state of
the world. We denote this probability by $T(s, a, s') = P(s' \mid a, s)$. For each state $s$
and action $a$ we have

$$\sum_{s' \in S} P(s' \mid a, s) = 1.$$

$R : S \times A \to \mathbb{R}$ is the reward function, giving the expected immediate reward gained
by the agent for taking each action in each state.

4.1.5 Partially Observable Markov Decision Process. When the agent is
unsure of the state $s$ that he is currently in, unlike the MDP where the agent knows
where he is at all times, the problem becomes a POMDP. There is some probability
distribution over the states in which the agent thinks he may be (the belief state $b$).

$O : S \times A \to \Pi(\Omega)$ is the observation function, which gives, for each action
and resulting state, a probability distribution over possible observations (we write
$O(s', a, o)$ for the probability of making observation $o$ given that the agent took
action $a$ and landed in state $s'$ [72]).

Monahan  [72]  introduces  a  system  where  one  of  three  decisions  may  be  made. 
Either  the  observer  can  “inspect”  -  attempt  to  observe  the  true  state  of  the  target 
another  time  (at  a  cost),  “stop”  -  make  a  determination  as  to  the  true  state  of  the 
target  and  have  no  option  for  further  observation,  or  “continue”  in  which  he  moves 
to  the  next  time  interval  (at  a  cost)  where  the  same  three  options  will  again  be 
available  to  him.  In  the  next  time  interval  there  is  some  probability  that  the  nature 
of  the  target  has  changed  which  is  a  difference  from  the  assumptions  in  this  paper. 
McAllister  [68]  further  notes,  “While  the  Markovian  property  does  not  hold  for  the 
state  of  the  system,  it  does  hold  for  the  belief  state  of  the  system.  The  optimal  policy 
for  any  given  stage  is  only  dependent  on  the  current  belief  state  and  not  decisions 
made  in  previous  stages.” 

4.1.5.1 Two-State Belief State. When trying to classify a suspected
target in a TIC situation, we are concerned with classifying the target as “legitimate”
or “illegitimate”, hence the two-state belief state (we believe, with some likelihood,
that the target is “legitimate” or “illegitimate”). In a POMDP with a two-state
belief state, the likelihood of being in either of the two states can be expressed by
the belief ($p$) of being in one state, since the likelihood of being in the
other state is $1 - p$. In Figure 53, the current belief state is expressed as $p$, which is
the observer’s belief that the true nature of the state is $s_1$. As the observer becomes
more confident in $s_1$ being the true nature of the state, $p$ will increase, and,
similarly, if the observer becomes more confident that $s_2$ is the true nature of the
state then $p$ will decrease (as $s_2$ becomes the more likely true state). Thus, our belief
state can be written as $b = (p, 1 - p)$, yielding $b(s_1) = p$ and $b(s_2) = 1 - p = q$ (note:
$b(s_1) + b(s_2) = 1$).


Figure 53: Two-State Belief State


In a simple example, depicted in Figure 54, initially the observer believes that
the true state is equally likely to be $s_1$ or $s_2$. The observer knows that each
observation has a 75% likelihood of being correct; thus, if he observes $s_1$, then his
belief state is $b = (0.75, 0.25)$, and if he observes $s_2$, then his belief state becomes
$b = (0.25, 0.75)$.

Figure 54: Belief State after First Observation (belief axis from 0 to 1; observing $o = s_2$ gives $p = 0.25$, $q = 0.75$; observing $o = s_1$ gives $p = 0.75$, $q = 0.25$)


A  complete  policy  for  a  POMDP  is  the  optimal  policy  for  each  possible  belief 
state  [50].  The  optimal  policy  for  a  given  stage  is  then  only  dependent  on  the 
belief  state  at  that  stage  and  not  decisions  made  during  previous  stages  (the  Markov 
condition).  Of  particular  note,  a  belief  space  may  be  partitioned  into  more  regions 
than  actions,  meaning  that  one  action  can  be  optimal  for  disparate  regions  of  the 
belief space [50] [73] [72].

Working  from  Bayes’  Theorem,  the  new  belief  state  at  a  new  stage  is: 


$$b'(s') = \frac{O(s', a, o) \sum_{s \in S} T(s, a, s')\, b(s)}{P(o \mid a, b)}$$ (28)

with $P(o \mid a, b) = \sum_{s' \in S} \left[ O(s', a, o) \sum_{s \in S} T(s, a, s')\, b(s) \right]$, indicating that the future belief
state is a function of the current stage’s belief state, the action taken during
the current stage, and the observation made.
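
The update in equation (28) can be illustrated with a minimal Python sketch. The function name `belief_update`, the dictionary layout for $T$ and $O$, and the single `"look"` action are assumptions made for this example only. The two-state setup mirrors the 75% accuracy example above, with a target whose true nature does not change between observations.

```python
def belief_update(belief, action, observation, T, O):
    """Posterior belief b'(s') in the form of equation (28).

    belief: dict mapping state -> probability
    T:      dict mapping (s, a, s_next) -> transition probability
    O:      dict mapping (s_next, a, o) -> observation probability
    """
    states = list(belief)
    unnormalized = {}
    for s_next in states:
        # Predicted probability of landing in s_next before seeing the observation.
        predicted = sum(T[(s, action, s_next)] * belief[s] for s in states)
        unnormalized[s_next] = O[(s_next, action, observation)] * predicted
    evidence = sum(unnormalized.values())  # P(o | a, b)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: value / evidence for s, value in unnormalized.items()}


# Hypothetical two-state example: the state never changes, and each
# observation names the true state with probability 0.75.
STATES = ("s1", "s2")
ACCURACY = 0.75
T = {(s, "look", s2): 1.0 if s == s2 else 0.0 for s in STATES for s2 in STATES}
O = {(s, "look", o): ACCURACY if s == o else 1.0 - ACCURACY
     for s in STATES for o in STATES}

prior = {"s1": 0.5, "s2": 0.5}
posterior = belief_update(prior, "look", "s1", T, O)
```

Starting from the uniform prior, a single observation of `"s1"` moves the belief to (0.75, 0.25), which agrees with the example depicted in Figure 54.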

4.1.6 Shoot-Look-Shoot. Glazebrook and Washburn [41] present a review
of  the  “Shoot-Look-Shoot”  problem  in  which  a  marksman  is  required  to  kill  a  given 
number  of  targets.  Once  the  marksman  shoots,  he  looks  to  see  if  the  target  has  been 
killed,  and  then,  if  the  target  hasn’t  been  killed,  he  may  choose  to  shoot  again  at 
that  same  target.  The  problem  gets  further  complicated  by  imperfect  information 
wherein  the  marksman  may  get  possibly  incorrect  information  as  to  the  alive/dead 
status  of  the  previous  target.  Glazebrook  and  Washburn  view  the  problem  as  a 
Markov  decision  process  and  use  a  stochastic  dynamic  programming  approach  to 
develop  the  marksman’s  optimal  strategy. 

A difference between the “Look-Look-Shoot” problem and the “Shoot-Look-Shoot”
problem is that LLS allows for only one shot. We assume that when
a target is fired upon, it is completely destroyed; battle damage assessment is not
implemented. In the LLS problem, imperfect information plays a role when the pilot
is unsure whether a target is a legitimate military target or not.

4.2 Finite Horizon

When  viewing  the  LLS  problem  as  having  a  finite  horizon  we  allow  for  only  a 
given  number  of  stages,  at  which  time  the  TIC  situation  has  been  resolved,  either 
by  firing  upon  the  target  or  the  air  support  leaving  the  situation.  The  finite  horizon 
problem lends itself to being solved through the dynamic programming method of
recursive fixing, starting from the final time period and incrementally making decisions
backwards. We implement this method both in the case where the quality (likelihood
of being correct) of information is constant across stages and in the case where information
improves as we move through the stages as more surveillance and intelligence become
available in the TIC situation.

We will further assume that information arrives at set intervals, each one unit
of time apart. The inputs to the problem then become: the information distribution
as a function of time, the cost of waiting one cycle for more information ($c_w$), and the
cost of striking a building which is not a legitimate military target ($c_s$). Further, $c_w < c_s$,
since otherwise there would never be an incentive to wait for more information.
The choices for action at each stage are “L” (look, make another observation),
“S” (shoot, fire upon the target, ending the scenario) or “X” (exit, leave the scenario
without firing). Note: only the “L” choice results in future stages.

4.2.1 Stationary Information. With a stationary information assumption,
information at each observation has the same probability of being correct. For example,
past observations in other TIC environments may point to a correct classification
rate of 75%, where we assume an observer correctly classifies the target as “legitimate”
or “illegitimate” 75% of the time.

In  the  binary  decision  for  stationary  information  such  as  determining  the  nature 
of  the  target,  the  likelihood  of  correct  classification  depends  only  on  the  difference  in 
the  number  of  “legitimate”  and  “illegitimate”  observations  made  at  a  point  in  time. 
For  instance,  if  there  are  three  out  of  five  observations  that  result  in  a  “legitimate” 
determination,  then  the  likelihood  of  this  object  truly  being  legitimate  is  exactly 
the  same  as  if  two  out  of  three  observations  yield  “legitimate”  calls.  (In  this  case, 
both have 0.5 observations above the 50% level.) The likelihood of correct
classification is then $p^{2n} / (p^{2n} + q^{2n})$, where $n$ is the number of observations above (or below)
50%. Clearly, for $p > 0.5$, as $n$ approaches infinity the likelihood of correct classification
approaches 1.
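
The dependence on the observation difference alone can be checked numerically. The sketch below assumes a 50/50 prior and independent observations, each correct with probability $p$; the function name `posterior_correct` is hypothetical. The difference between the two call counts corresponds to $2n$ in the notation above.

```python
def posterior_correct(p, legit_calls, illegit_calls):
    """Posterior probability that the majority call is right.

    Assumes a 50/50 prior and independent observations that are each
    correct with probability p. Only the difference in counts matters.
    """
    q = 1.0 - p
    diff = abs(legit_calls - illegit_calls)  # equals 2n in the text's notation
    return p**diff / (p**diff + q**diff)


three_of_five = posterior_correct(0.75, 3, 2)
two_of_three = posterior_correct(0.75, 2, 1)
five_of_five = posterior_correct(0.75, 5, 0)
```

With $p = 0.75$, three “legitimate” calls out of five and two out of three both give a posterior of 0.75, while five unanimous calls give 243/244.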

Further, we can determine the likelihood of a future observation agreeing with
previous observations. In a simple case, assume that $p = 0.75$; what is the likelihood
of the second observation agreeing with the first observation? The answer is
the likelihood that they are both wrong plus the likelihood they are both correct
($p^2 + q^2 = 0.75^2 + 0.25^2 = 0.625$), and the likelihood that they disagree is then
$1 - 0.625 = 0.375$. Moreover, if we know that the previous observations (regardless
of how many correct and incorrect observations have been made) yield a 0.75 probability
of being correct, then the next observation will agree with the prior consensus
62.5% of the time and disagree 37.5% of the time. We can further prove (see Appendix B)
that with information improvement over time, one would never shoot immediately
following an “illegitimate” call.
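
The agreement calculation generalizes to any prevailing belief. A minimal sketch follows, assuming the next observation is correct with probability `accuracy` independently of earlier ones; the function name `agreement_probability` is introduced here for illustration.

```python
def agreement_probability(belief, accuracy):
    """Probability that the next observation agrees with the prevailing belief.

    belief:   current probability that the prevailing classification is right
    accuracy: probability that a single observation is correct
    """
    # Agreement happens if the belief is right and the observation is right,
    # or the belief is wrong and the observation is wrong as well.
    return belief * accuracy + (1.0 - belief) * (1.0 - accuracy)


second_agrees_with_first = agreement_probability(0.75, 0.75)
second_disagrees = 1.0 - second_agrees_with_first
```

For a belief of 0.75 and an accuracy of 0.75 this reproduces the 0.625 and 0.375 figures in the text.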

4.2.2 Improving Information. There are reasons to believe that the information
gathered in successive stages will improve due to more intelligence assets
being  placed  in  the  scenario,  whether  in  the  form  of  more  ground  forces  entering 
the  TIC  scenario,  more  UAVs  being  moved  into  the  environment  or  more  air-to- 
ground  fighters  joining  the  TIC.  With  this  assumption,  better  information  becoming 
available  will  typically  resolve  the  nature  of  the  suspected  target  more  quickly. 

4.2.3 Recursive Fixing. In the finite horizon scenario for either stationary
or improved information, as the horizon stage (the last stage considered) increases
linearly, the number of possible strategies increases exponentially. For instance, if
there is only one stage, then our choices are to “shoot” or “leave” depending on
whether the first observation is “legitimate” or “illegitimate”. That is, we could have
an optimal strategy of (S,S), (S,X), (X,S), or (X,X) for the two possible outcomes
of the observation. For the case where there are two observations made, we could
have many more strategies, since now a strategy after the first observation may be to
look (“L”), giving three possible actions after the first observation. Further, we add
another round of observations and decisions when making a second observation, and
this grows the number of possible strategies exponentially as the stages increase (see
Figure 55). Note that the number of possible strategies equals $5^{n-1}$, where $n$ is the horizon stage.

Figure 55: Possible Strategies by Horizon

Again, we will assume that information arrives at set intervals, each one unit
of time apart. The inputs remain the information distribution as a function
of time, the cost of waiting one cycle for more information ($c_w$), and the cost of
striking a building which is not a legitimate military target ($c_s$). Let us assume,
for example, that information follows the cumulative distribution function of the
geometric distribution with a probability of $p$. Thus, the likelihood of getting correct
information on the first look is $p$, on the second look it is $1 - (1 - p)^2$, and on the
third look $1 - (1 - p)^3$ (that is, on the $n$th look the likelihood of correct information
is $1 - (1 - p)^n$). Further, we will assume that each look is independent of any other
look.
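
The assumed accuracy schedule is easy to tabulate. The short sketch below uses a hypothetical helper name, `look_accuracy`, and implements only the geometric form stated above.

```python
def look_accuracy(p, n):
    """Likelihood that the n-th look is correct under the geometric-CDF assumption."""
    if n < 1:
        raise ValueError("looks are numbered from 1")
    return 1.0 - (1.0 - p) ** n


# Accuracy of the first three looks for p = 0.75.
schedule = [look_accuracy(0.75, n) for n in (1, 2, 3)]
```

For $p = 0.75$ the first three looks are correct with probability 0.75, 0.9375, and 0.984375 respectively.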

We will assume a priori that, upon arrival at a TIC, a target is equally likely to be
legitimate or illegitimate. With a finite horizon of only one look, the problem is then
simple to solve. If upon the first look the target is called “legitimate”, then if
$c_w/c_s > 1 - p$ the target should be fired upon and if $c_w/c_s < 1 - p$ then no strike
should take place. In a similar manner, if the target is called “illegitimate” and
$c_w/c_s < p$ then no strike should take place, and $c_w/c_s > p$ indicates that a strike
should still take place.

With a finite time horizon and two looks, the problem is more complicated
since the two looks may yield different responses. For instance, the first look may
call the target “legitimate” and the second calls it “illegitimate”. By our assumption
the second look is more accurate, but it is mitigated by the first look yielding a
“legitimate” call. For simplicity, call the likelihood of correct information at the $n$th
look $L(p, n)$, where $L(p, n) = 1 - (1 - p)^n$, and let $L'(p, n) = (1 - p)^n$ be the likelihood of
the $n$th look being incorrect. If both calls are “legitimate” then the likelihood of the
target being legitimate is $\frac{L(p,1)L(p,2)}{L(p,1)L(p,2) + L'(p,1)L'(p,2)}$. If the first call is “legitimate” and
the second call is “illegitimate” then the likelihood of the first call being correct is
$\frac{L(p,1)L'(p,2)}{L(p,1)L'(p,2) + L'(p,1)L(p,2)}$. Now, we begin to see the dynamic programming formulation
of the finite horizon LLS problem. The cost of waiting must not only include $c_w$, but
must also include future expected costs of $c_w$ and $c_s$.

If  we  consider  “S”  and  “X”  to  be  the  same  action  (both  of  which  end  the 
situation)  then  the  finite  horizon  network  with  two  observations  looks  like: 


Figure 56: Two-Look Horizon

Again, assume a two-look horizon where a target is equally likely to be legitimate
or illegitimate. We can then calculate the likelihood of a “legitimate” or
“illegitimate” call at each stage given previous calls. At look 1 (L1) the likelihood of
a “legitimate” call is 0.5. The likelihood of L2 being the same as L1 is $p(1 - q^2) + q^3$,
with the likelihood of being contradictory at $1 - (p(1 - q^2) + q^3)$. Further, the cost
of never shooting equals $2c_w$, the cost of waiting and then shooting is $c_w + p_I c_s$,
and the cost of shooting immediately after the first look is $p_I c_s$. While $c_s$ and
$c_w$ are constant, $p_I$ (the probability of the target being illegitimate) changes as we
get more or different observations. For example, $p_I = q$ if we shoot after L1 returns
“legitimate”; however, $p_I = p$ if we shoot after L1 returns “illegitimate”. For the
two-look horizon problem, shooting after both looks return “legitimate” yields:

$$p_I = \frac{L'(p,1)L'(p,2)}{L(p,1)L(p,2) + L'(p,1)L'(p,2)} = \frac{q^3}{q^3 + p(1 - q^2)}.$$

Shooting after L1 = “I” and L2 = “L” results in:

$$p_I = \frac{L(p,1)L'(p,2)}{L'(p,1)L(p,2) + L(p,1)L'(p,2)} = \frac{pq^2}{q(1 - q^2) + pq^2}.$$

If we assume $p = 0.75$, then L1 = L2 = “L” yields $p_I = \frac{1}{46}$, whereas L1 = “I”
and L2 = “L” results in $p_I = \frac{1}{6}$. Then, if after two looks $c_w/c_s > \frac{1}{6}$, one
would shoot if L2 = “L”; if $\frac{1}{46} < c_w/c_s < \frac{1}{6}$, one should shoot if L1 = L2 = “L”, and not
shoot otherwise. Determining what to do after the first look is more complicated,
since we must incorporate the expected cost after the second look.

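If L1 = “L” our cost will be:

$$\min\left\{ q c_s,\; c_w + \left(p(1 - q^2) + q^3\right) \cdot \min\left\{c_w,\; c_s \frac{q^3}{q^3 + p(1 - q^2)}\right\} + c_w\left(1 - p(1 - q^2) - q^3\right) \right\}.$$

Using the $p = 0.75$ assumption, this breaks down to
$\min\{\frac{c_s}{4},\; c_w + \frac{23}{32} \cdot \min\{c_w, \frac{c_s}{46}\} + \frac{9 c_w}{32}\}$.
If $c_w/c_s$ is large enough that $\frac{c_s}{4}$ is the smaller term, we would
shoot after L1 = “L”. If $c_w/c_s > \frac{1}{46}$, the inner minimum is $\frac{c_s}{46}$ and the cost becomes
$\min\{\frac{c_s}{4},\; c_w + \frac{23}{32} \cdot \frac{c_s}{46} + \frac{9 c_w}{32}\} = \min\{\frac{c_s}{4},\; \frac{41 c_w}{32} + \frac{c_s}{64}\}$;
whenever the second term is the smaller one, we would wait to shoot.

If L1 = “I”, then our cost will be
$c_w + c_w\left(p(1 - q^2) + q^3\right) + \left(1 - p(1 - q^2) - q^3\right) \cdot \min\left\{c_w,\; c_s \frac{pq^2}{q(1 - q^2) + pq^2}\right\}$.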
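
The cost comparison after the first look can be evaluated numerically for any choice of $c_w$ and $c_s$. The sketch below is one reading of the cost expressions for the two-look horizon, with geometric accuracy and a 50/50 prior. The function name `two_look_costs` and the returned dictionary keys are hypothetical.

```python
def two_look_costs(p, c_w, c_s):
    """Expected costs after the first look in the two-look horizon.

    Assumes look 1 is correct with probability p and look 2 with
    probability 1 - q^2, where q = 1 - p.
    """
    q = 1.0 - p
    agree = p * (1.0 - q**2) + q**3          # L2 repeats the L1 call
    disagree = 1.0 - agree                   # L2 contradicts the L1 call
    p_i_after_ll = q**3 / agree              # illegitimate given L1 = L2 = "L"
    p_i_after_il = p * q**2 / disagree       # illegitimate given L1 = "I", L2 = "L"

    # After L1 = "L": shoot now, or wait and act on the second look.
    shoot_now = q * c_s
    wait_after_l = c_w + agree * min(c_w, c_s * p_i_after_ll) + c_w * disagree
    # After L1 = "I": wait; shoot only if L2 contradicts and shooting is cheaper.
    wait_after_i = c_w + c_w * agree + disagree * min(c_w, c_s * p_i_after_il)

    return {
        "shoot_after_L": shoot_now,
        "wait_after_L": wait_after_l,
        "best_after_L": "S" if shoot_now <= wait_after_l else "L",
        "wait_after_I": wait_after_i,
    }


cheap_wait = two_look_costs(0.75, c_w=0.01, c_s=1.0)
costly_wait = two_look_costs(0.75, c_w=0.2, c_s=1.0)
```

With $p = 0.75$ and $c_w/c_s = 0.01$, waiting after a “legitimate” first call costs $2c_w = 0.02$ against 0.25 for shooting at once, so looking again is preferred. With $c_w/c_s = 0.2$ the comparison reverses.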
Figure 57: Two-Look Horizon Policy (p = 0.75)

4.3 Infinite Horizon with Stationary Information

The  infinite  horizon  problem  cannot  be  solved  by  recursive  fixing  since  there 
is  no  final  stage.  However,  for  stationary  information,  we  can  rely  upon  the  Markov 
attributes  of  the  problem  in  order  to  recursively  solve  the  TIC  problem. 

4.3.1 Building the Transition Matrix. In the stationary information case,
we are able to exploit the Markov nature of the problem since the likelihood of a
target being correctly identified as a friend or foe depends only on the difference in
the number of “legitimate” and “illegitimate” observations made up to that stage.
The state at a given time is determined by the belief state at that time, and not by the
number of observations made to that point (this is not the case for improving information).
Further, we know the likelihood of the next observation being “legitimate”
or “illegitimate”. Let $b$ be the belief state at a point in time, where $b$ is the greater of
the probability that the target is “legitimate” or “illegitimate”; then the likelihood
that the next observation agrees with the prevailing belief is $b \cdot p + (1 - b) \cdot (1 - p)$.
As $b$ grows larger, the likelihood of the next observation agreeing with the prevailing
belief increases (with a limit of $p$).

Therefore, we can make a transition matrix based on the belief state (expressed
in terms of the difference between the number of “legitimate” and “illegitimate” calls)
prior to the current stage (i.e., $P_1$ may represent 5 “L”s and 4 “I”s or 1 “L” and 0
“I”s). From $P_1$, for example, we can only move to $P_0$ or $P_2$. The transition matrix,
which is infinitely large, becomes:


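$$P = \begin{bmatrix}
\ddots & \vdots & \vdots & \vdots & \\
\cdots & P_{-1,-1} & P_{-1,0} & P_{-1,1} & \cdots \\
\cdots & P_{0,-1} & P_{0,0} & P_{0,1} & \cdots \\
\cdots & P_{1,-1} & P_{1,0} & P_{1,1} & \cdots \\
 & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}$$ (29)

with $\sum_{j=-\infty}^{\infty} P_{i,j} = 1 \;\; \forall i \in \mathbb{Z}$ and $P_{i,j} = 0$ if $|i - j| \neq 1$.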

If we choose to truncate the matrix at a given point (in this case, letting
$i$ and $j$ range from -3 to 3) and we insert the transition probabilities then the result is:

Since $P_{i,j} = P_{-i,-j}$ we can convert the transition matrix to:

(31)


Now that we have described the transition matrix we can use it to guide the optimal
strategy determination. If the likelihood of incorrectly identifying the target as a
legitimate target is below the cost of waiting divided by the cost of shooting at an
illegitimate target then we would always fire. That is, if $p_I = 1 - p_L < c_w/c_s$ then we
would fire upon the target.
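
The firing rule can be turned into a search for the smallest observation difference at which firing is justified. The sketch below assumes stationary information with $p > 0.5$, a 50/50 prior, and a posterior probability of an illegitimate target of $q^i / (p^i + q^i)$ after a net difference of $i$ “legitimate” calls. The function name `terminating_difference` and the `max_diff` guard are introduced for illustration.

```python
def terminating_difference(p, cost_ratio, max_diff=1000):
    """Smallest difference i with q^i / (p^i + q^i) < cost_ratio (= c_w / c_s)."""
    if not 0.5 < p < 1.0:
        raise ValueError("this sketch assumes 0.5 < p < 1")
    q = 1.0 - p
    for i in range(max_diff + 1):
        p_illegitimate = q**i / (p**i + q**i)
        if p_illegitimate < cost_ratio:
            return i
    raise ValueError("no terminating difference found within max_diff")


state_for_example = terminating_difference(0.75, 0.01)
```

For $p = 0.75$ and $c_w/c_s = 0.01$ the search returns 5, since a difference of 4 leaves a 1/82 chance of an illegitimate target while a difference of 5 leaves 1/244.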

By looking at the limiting behavior of the transition matrix ($P^{(\infty)}$), we can then
see the likelihood of the target truly being legitimate or illegitimate based on the
previous observations. Again, if we had only one observation and it was “legitimate”
then the likelihood of that target truly being legitimate is 0.75 (the same is true if
we had two “legitimate” calls and one “illegitimate” call). Thus the likelihood of
being correct based on the difference in the number of observations is

$$\left[\; \cdots \quad \frac{p^2}{p^2+q^2} \quad \frac{p}{p+q} \quad 0.5 \quad \frac{p}{p+q} \quad \frac{p^2}{p^2+q^2} \quad \cdots \;\right]$$ (32)

supported by the $p_I = 1 - p_L < c_w/c_s$ determination, which prescribes that if $\frac{q^i}{p^i+q^i} < \frac{c_w}{c_s}$
then we will fire, thus negating the need for further observations. Once we find the $i$
for which “S” or “X” (i.e. $\frac{q^i}{p^i+q^i} < \frac{c_w}{c_s}$) is the optimal policy then we can recursively
find the optimal strategy for any $i$.

4.3.2 Mean Time Spent in Transient States. Assuming we have found the
terminating state $i$, then we will determine the optimal policy recursively for state
$i - 1$, $i - 2$, and so on until reaching 0. To accomplish this we must know the
relative costs of actions in state $i - 1$. In state $i - 1$, we will know the cost associated
with the “S” or “X” action in state $i$, but we need to know the expected number
of observations necessary to reach state $i$ from $i - 1$ (or the mean time spent in
transient states $(0, i - 1)$) [85].

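$$P_T = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1t} \\ \vdots & \vdots & \ddots & \vdots \\ p_{t1} & p_{t2} & \cdots & p_{tt} \end{bmatrix}$$ (34)

$$S = \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1t} \\ \vdots & \vdots & \ddots & \vdots \\ s_{t1} & s_{t2} & \cdots & s_{tt} \end{bmatrix}$$ (35)

$$S = I + P_T S$$ (36)

$$(I - P_T) S = I$$

$$S = (I - P_T)^{-1}$$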

Assuming  i  =  5,  p  =  0.75,  then  our  S  matrix  would  be: 

$$S = (I - P_T)^{-1} = \begin{bmatrix}
1.98 & 2.62 & 2.13 & 1.84 & 1.34 \\
0.98 & 2.62 & 2.13 & 1.84 & 1.34 \\
0.38 & 1.02 & 2.13 & 1.84 & 1.34 \\
0.13 & 0.34 & 0.70 & 1.84 & 1.34 \\
0.03 & 0.09 & 0.18 & 0.47 & 1.34
\end{bmatrix}$$ (37)

The S matrix shows that if we are in state $i - 1 = 4$, then we will need
0.03  +  0.09  +  0.18  +  0.47  +  1.34  =  2.11  more  observations  on  average  to  reach  the 
absorbing  state  i  where  we  know  that  the  optimal  strategy  is  “X”  or  “S” . 
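
The mean-time calculation can be sketched in pure Python. The construction below is an assumption about how the transient matrix is built: states are the absolute difference between “legitimate” and “illegitimate” calls, state 0 always moves to state 1, and from state $k$ the next observation agrees with the prevailing belief $b_k = p^k / (p^k + q^k)$ with probability $b_k p + (1 - b_k)(1 - p)$. All function names are hypothetical, and the small Gauss-Jordan routine stands in for any linear algebra library.

```python
def transient_matrix(p, terminating_state):
    """Transition probabilities among transient states 0 .. terminating_state - 1."""
    t = terminating_state
    if t < 2:
        raise ValueError("this sketch needs at least two transient states")
    q = 1.0 - p
    P = [[0.0] * t for _ in range(t)]
    P[0][1] = 1.0  # folded chain: a difference of 0 always becomes 1
    for k in range(1, t):
        belief = p**k / (p**k + q**k)
        agree = belief * p + (1.0 - belief) * q
        P[k][k - 1] = 1.0 - agree
        if k + 1 < t:
            P[k][k + 1] = agree  # a move into the terminating state is absorption
    return P


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fundamental_matrix(p, terminating_state):
    """S = (I - P_T)^{-1}: expected visits to each transient state."""
    P = transient_matrix(p, terminating_state)
    n = len(P)
    i_minus_p = [[(1.0 if r == c else 0.0) - P[r][c] for c in range(n)]
                 for r in range(n)]
    return invert(i_minus_p)


S = fundamental_matrix(0.75, 5)
expected_looks_from_state_4 = sum(S[4])
```

Under these assumptions the computed entries round to the values shown in equation (37), and the last row sums to roughly 2.11 expected observations before reaching the terminating state.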

4.3.3 Constructing the Optimal Policy. Let $s_i = \sum_{n=1}^{t} s_{in}$, which indicates
the expected number of transitions from the $i$th state until reaching the terminating
state. If $s_i c_w > \frac{q^i}{p^i+q^i} c_s$ then the optimal policy would be to shoot upon reaching
state $i$. Further, since there is uncertainty still at the terminating state equal to
$\frac{q^{t+1}}{p^{t+1}+q^{t+1}}$, if $s_i c_w + \frac{q^{t+1}}{p^{t+1}+q^{t+1}} c_s > \frac{q^i}{p^i+q^i} c_s$ the optimal policy is to shoot. In this
example $p = 0.75$, and the ratio of $c_w$ to $c_s$ is 0.01.

The infinite horizon scenario for the LLS can be mapped to the tiger problem
presented in [50], where the observer must choose between continuing to listen for a
tiger (at a small cost) or opening a door revealing a tiger (or fortune) at a large cost or
reward. Once the observer’s belief state reaches a certain level of confidence,
he would choose to open the door. This problem has a fairly simple structure for the
optimal decision policy: the three choices, to open the left door, continue
to listen, or open the right door, correspond to three non-overlapping segments of the [0,1]
belief state. As a reminder, though there may be only three possible decisions in
a POMDP, the number of ranges where one decision is optimal is infinite. The
challenge then is to determine the two points in the belief state where we move from
“leave” to “look” and from “look” to “shoot”. In the case where the cost of firing
upon an invalid target equals the cost of leaving when the target is valid, the
two points are equal distances from 0 and 1 (i.e. the two points add to 1).

4.4 Results

Now that we can construct the optimal policy in the infinite horizon, stationary
information problem, we investigate the effects of the parameters of the problem on
the average time to make decisions and the average cost of a TIC situation.

4.4.1 Effects of Intelligence on Decisions. In Figure 58 we view the effects
of  intelligence  accuracy  on  the  observation  difference  necessary  to  make  an  “S”  or 
“X”  decision.  In  this  example,  we  assume  that  cw/cs  is  0.01.  We  see  that,  as 
one  might  expect,  extremely  accurate  intelligence  leads  to  a  very  small  number  of 
observations  necessary  to  make  an  “S”  or  “X”  decision  (in  the  case  where  p  =  0.99, 
we  would  immediately  make  an  “S”  or  “X”  determination).  However,  of  interest, 
is  that  extremely  poor  information,  such  as  p  =  0.5,  results  in  a  similar  choice  of 
“S”  or  “X”  after  the  first  observation.  This  basically  tells  us  that  our  information  is 
so  unreliable,  that  waiting  for  more  “bad”  information  will  do  us  no  good.  We  will 
have  no  improved  belief  state  with  time. 
Figure 58: Quality of Information vs. Observations
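
The threshold behaviour described for Figure 58 can be explored numerically. The following is a minimal sketch and not the implementation used in this dissertation: it assumes a uniform prior, equal error costs normalized to $c_s = 1$, a state equal to the difference between “legitimate” and “illegitimate” calls, and plain value iteration on a state space truncated at a hypothetical `max_diff`. The function name and its parameters are illustrative only.

```python
def stopping_threshold(p, cw_over_cs, max_diff=40, sweeps=5000, tol=1e-12):
    """Smallest observation difference d >= 0 at which stopping ("S" or "X")
    is at least as cheap as waiting, for accuracy 0.5 <= p < 1.

    Assumptions of this sketch (not taken from the source): uniform prior,
    equal error costs normalized to c_s = 1, and a state space truncated at
    +/- max_diff, where the observer is forced to stop.
    """
    q = 1.0 - p

    def belief(d):
        # P(target is valid | d more "legitimate" than "illegitimate" calls)
        return 1.0 / (1.0 + (q / p) ** d)

    diffs = range(-max_diff, max_diff + 1)
    # Expected cost of stopping now: probability that the better guess is wrong.
    stop_cost = {d: min(belief(d), 1.0 - belief(d)) for d in diffs}
    value = dict(stop_cost)

    for _ in range(sweeps):
        change = 0.0
        for d in range(-max_diff + 1, max_diff):
            b = belief(d)
            p_up = b * p + (1.0 - b) * q  # chance the next call says "legitimate"
            wait = cw_over_cs + p_up * value[d + 1] + (1.0 - p_up) * value[d - 1]
            new = min(stop_cost[d], wait)
            change = max(change, abs(new - value[d]))
            value[d] = new
        if change < tol:
            break

    for d in range(0, max_diff + 1):
        if value[d] == stop_cost[d]:  # stopping is optimal in this state
            return d
    return max_diff
```

Under these assumptions, `stopping_threshold(0.99, 0.01)` returns 1, so a single observation settles the decision, while an intermediate accuracy such as p = 0.7 requires a larger difference. At p = 0.5 the sketch stops before taking any observation, because it does not force a first look; the convention in the text, where a first observation is always taken, would report one observation instead.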

4.4.2 Effects of a priori Information on Decisions. Up to this point, the
assumption has been that the observer is equally likely to see either a valid or an
invalid target. However, if historical evidence points to a different ratio of valid
to invalid targets, then we can incorporate this into the model. A priori
information has the same effect as previous observations would have: it sets the
initial belief state of the system. Whereas before the initial belief state was
(0.5, 0.5), if the accuracy of the a priori information is $p_p$, then our
initial belief state will be $(p_p, 1 - p_p)$ or $(1 - p_p, p_p)$. The same technique of
finding the absorbing state and recursively setting the optimal policy will apply.
The transition matrix will reflect the a priori information, where the entries to the
left and right of the diagonal are multiplied by $2p_p$ and $2(1 - p_p)$, or vice versa.
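
The claim that a priori information acts like earlier observations can be illustrated with Bayes' rule. This is a minimal sketch under the assumption that calls are conditionally independent and each is correct with probability p (with 0.5 < p < 1); the function names are hypothetical and do not come from the source.

```python
import math


def posterior_valid(prior_valid, p, diff):
    """P(target is valid) after `diff` more "legitimate" than "illegitimate"
    calls, each correct with probability p, starting from prior_valid."""
    q = 1.0 - p
    weight_valid = prior_valid * (p / q) ** diff
    return weight_valid / (weight_valid + (1.0 - prior_valid))


def prior_as_observations(prior_valid, p):
    """Net number of "legitimate" calls that a prior is worth (may be fractional)."""
    return math.log(prior_valid / (1.0 - prior_valid)) / math.log(p / (1.0 - p))
```

For example, a prior of 0.7 combined with calls of accuracy 0.7 gives the same belief state as a uniform prior followed by one extra “legitimate” call, which is the sense in which the prior shifts the starting state of the process.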

4.4.3 Effects of Weights on Decisions. In Figure 59, we vary the ratio of $c_w$
to $c_s$ from 0.1 to 1.0 to see the effect of the relative costs on our decision threshold.
As the cost of waiting decreases relative to the cost of either firing on an illegitimate
target or not firing on a legitimate target, the number of observations necessary to
make an “S” or “X” determination increases.


Figure 59: Effect of Changing Costs (axis label: Probability of Correct Information)

Further, we have assumed that the cost of shooting and the cost of leaving
(the two conclusions at which the observer no longer waits) were equal. However, if
the cost of shooting and the cost of leaving were unequal, then the transition matrix
would have to be altered. Additionally, this would create two different levels for the
two absorbing states (the points where the observer would definitely leave the scene
or definitely fire upon the target).

The findings of this chapter are based on a multitude of assumptions in an
effort to keep this material unclassified. However, the sensitivity analyses provide
understanding of the various factors at play in these TIC situations.

4.5 Conclusion and Future Work

In this chapter, we have provided a framework for making optimal policy
decisions in fast-moving TIC situations where observers are unsure of the nature of
possible enemy forces in both finite horizon and infinite horizon problems. Through
the recursive technique of solving this Markov decision process we have demonstrated
the effect of improved intelligence and differing weights concerning waiting and
making incorrect decisions in the face of uncertain situations.

Future work involves creating heuristics for solving the TIC problem for the
infinite horizon with improving information. In these situations, the Markov property
will not hold, limiting the ability to apply many of the techniques in this paper.
Additionally, making the problem more reflective of the real world will lead to more
complicated cost and decision parameters.
V. Summary, Future Work, and Conclusions

5.1 Summary of Original Contribution

In this dissertation, a characterization of the distribution of supply airdrops
and methods for optimally dropping them is presented. Specifically, supply
airdrops follow a bivariate normal distribution in which the x and y deviations are
uncorrelated ($\rho = 0$). A surrogate approximation function for the bivariate
normal distribution supports quick integration of the distribution to assess drop risk.
RSM with the surrogate and DE both return Pareto-optimal results with respect to the
tradeoff between runtime and accuracy. Both achieve near-optimal solutions of the
non-linear program resulting from the airdrop problem, quickly finding settings for
both airdrop location and approach angle. Enumeration is strongly dominated by
all other algorithms.
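
To illustrate why the uncorrelated assumption simplifies risk assessment, the sketch below computes the probability that a drop lands inside an axis-aligned rectangle. It is not the surrogate approximation function referred to above: it is the exact product form that holds when the rectangle is aligned with the principal axes of an uncorrelated bivariate normal, and all names and parameters are hypothetical.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a univariate normal."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def rectangle_hit_probability(aim_x, aim_y, sd_x, sd_y,
                              x_min, x_max, y_min, y_max):
    """Probability that a drop aimed at (aim_x, aim_y) lands in the rectangle.

    With zero correlation the joint density factors, so the double integral
    is the product of two one-dimensional probabilities.
    """
    prob_x = normal_cdf(x_max, aim_x, sd_x) - normal_cdf(x_min, aim_x, sd_x)
    prob_y = normal_cdf(y_max, aim_y, sd_y) - normal_cdf(y_min, aim_y, sd_y)
    return prob_x * prob_y
```

A collateral object that is not aligned with the drop axes, or that is not rectangular, has no such closed form, which is one reason a fast approximation of the integral is useful.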

We suppose that an airdrop planner who has been shown the oval shapes and scales
of a bundled set of supply airdrops could predict the optimal aimpoint within 50
meters in each direction and the drop angle within 5 degrees of the optimal
solution. Note that this is a high standard: we have looked at hundreds of
combinations of drops, yet still only approach that level of accuracy. In our base problem,
where the collateral objects have the same weighting, the planner “eyeballing” a
solution would have a collateral risk 14% higher than the optimum. In more
complicated scenarios where the collateral objects are weighted differently, “eyeballing”
a solution becomes much worse than the solutions found by our algorithms, with
“eyeballed” solutions routinely worse by 20% or more. A more reliable technique
must be implemented to limit damage and ensure recoverability.

Additionally, a quick and accurate algorithm for creating the Pareto
optimal frontier in the multi-objective airstrike problem is presented. This algorithm,
which leverages specific attributes of lethality and collateral risk, is shown to
routinely outperform differential evolution and enumeration algorithms. Once Pareto
optimal solutions are found, these can be quickly converted to solutions to the
associated goal-programming or weighted sum scalarization problems. The choice of
damage function is shown to greatly affect the expected lethality and collateral risk
in an airstrike, underscoring the need for accurate estimation of weapons effects.
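
For readers unfamiliar with the term, the sketch below shows what membership in the Pareto optimal frontier means for this two-objective problem. It is a generic non-dominated filter over already-evaluated aimpoints, not the specialized algorithm presented in this dissertation, and the function name and tuple layout are assumptions made for illustration.

```python
def pareto_front(candidates):
    """Return the non-dominated (lethality, collateral_risk) pairs.

    Higher lethality and lower collateral risk are both preferred. A pair is
    kept only if no other pair is at least as good in both objectives.
    """
    front = []
    lowest_risk_so_far = float("inf")
    # Visit candidates from most to least lethal; ties go to the lower risk.
    for lethality, risk in sorted(set(candidates), key=lambda c: (-c[0], c[1])):
        if risk < lowest_risk_so_far:
            front.append((lethality, risk))
            lowest_risk_so_far = risk
    return front
```

Given such a frontier, a weighted sum or goal-programming choice reduces to scanning a short list of pairs rather than re-solving the aiming problem.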

We demonstrate that the current methodology of not using offset aiming yields
lethality 26% higher at a cost of collateral risk 176% higher than a collateral-first
approach. The algorithm presented can be incorporated into the weapon (and
employment) decisions facing an airstrike planner, who could alter selections based on
the minimum lethality needed or maximum collateral risk allowed to remedy the
limitation from non-offset targeting.

Finally, we provide a framework for making optimal policy decisions in
fast-moving TIC situations where observers are unsure of the nature of possible enemy
forces in both finite horizon and infinite horizon problems. Through the recursive
technique of solving this Markov decision process we have demonstrated the effect of
improved intelligence and differing weights concerning waiting and making incorrect
decisions in the face of uncertain situations.

5.2  Future  Work 

The future work for the research presented in this dissertation will be adapting
the algorithms and theory to real-world software and applications. Tools currently
in use by the USAF have more complicated inputs for both airdrop and airstrike
collateral estimates. While the tools being implemented today provide more accuracy
than the assumptions in this work, they all seem to lack the optimization step that
is necessary to truly lower collateral risk.

5.3  Conclusions 

The importance of collateral damage minimization in U.S. engagements around
the world is undeniable. While steps have been taken to estimate collateral risks for
airdrop and airstrike missions, there has been little done to minimize this collateral
risk efficiently. For airdrops, there is no tool available to find optimal locations
within a drop area to avoid collateral risk while ensuring recoverability; typically,
trained mission planners look for areas within a scene to make a drop. Results from
this work indicate that planners could be greatly aided by the work presented.

For airstrikes, offset aiming is a vital piece of mission planning, and one that
should be incorporated in the earliest stages of collateral damage estimation.
Further, the cookie-cutter damage function should be scrapped in favor of more
representative damage functions. While these functions may be more difficult to visualize,
the software packages available to mission planners should have no issues with
handling the more complicated distributions.

TIC  scenarios  present  the  greatest  collateral  risk  and  the  most  difficult  type 
of  risk  to  lower.  Improved  intelligence  gathering  and  an  a  priori  understanding 
of  tradeoffs  within  a  TIC  have  been  shown  to  speed  up  decision-making  in  these 
time-sensitive  engagements. 

Collateral damage and civilian deaths continue to plague U.S. missions
worldwide, and attempts to minimize these risks are at the forefront of military leaders’
efforts. This dissertation presents important improvements in understanding the
source of collateral risk and steps which the U.S. military can take to minimize risk
while still ensuring mission success.
Appendix A. Proof 1

THEOREM: The expected collateral risk in a randomly-generated infinitely-large
scene satisfies $E[f_{2_{cc}}] < E[f_{2_g}] < E[f_{2_e}]$ regardless of the accuracy and lethal range of the
weapon.

PROOF: Expected collateral risk is independent of accuracy when randomly aiming
by Formula 18. Additionally, by assumption, the lethal range for each damage
function is equal, i.e. $LR = \int_0^\infty d_{cc}(r)\,dr = \int_0^\infty d_g(r)\,dr = \int_0^\infty d_e(r)\,dr$.

Each pair-wise set of damage functions intersects only once. For the cookie-cutter
and the Gaussian damage functions, the functions only intersect at $r = LR$: since
$r < LR \rightarrow d_{cc} = 1$, $r > LR \rightarrow d_{cc} = 0$, and $0 < d_g < 1$, when $r < LR$ then
$d_{cc} > d_g$, and when $r > LR$ then $d_{cc} < d_g$. For the Gaussian and exponential functions,
the two functions cross only at the point $r = [\ldots]$, with $d_e < d_g$ for $r < [\ldots]$ and
$d_e > d_g$ for $r > [\ldots]$.


LEMMA: If $\int_0^\infty d_1(r)\,dr = \int_0^\infty d_2(r)\,dr$, $d_1(r) < d_2(r)$ for $r \in [0, x)$, and $d_1(r) > d_2(r)$
for $r \in (x, \infty)$, then $\int_0^\infty r\,d_1(r)\,dr > \int_0^\infty r\,d_2(r)\,dr$.
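
The lemma is stated without proof. One short argument, offered here as a sketch and assuming that all of the integrals involved are finite, is the following.

```latex
% For r < x both factors are negative; for r > x both are positive.
\[
  (r - x)\,\bigl(d_1(r) - d_2(r)\bigr) > 0 \quad \text{for all } r \neq x
  \;\;\Longrightarrow\;\;
  \int_0^\infty (r - x)\,\bigl(d_1(r) - d_2(r)\bigr)\,dr > 0 .
\]
% Expand the integral and use the equal-area hypothesis to drop the x term.
\[
  \int_0^\infty r\,\bigl(d_1(r) - d_2(r)\bigr)\,dr
  \;-\; x \underbrace{\int_0^\infty \bigl(d_1(r) - d_2(r)\bigr)\,dr}_{=\,0}
  \;>\; 0
  \;\;\Longrightarrow\;\;
  \int_0^\infty r\,d_1(r)\,dr > \int_0^\infty r\,d_2(r)\,dr .
\]
```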
Therefore, $\int_0^\infty r\,d_{cc}(r)\,dr < \int_0^\infty r\,d_g(r)\,dr < \int_0^\infty r\,d_e(r)\,dr \rightarrow 2\pi n \int_0^\infty r\,d_{cc}(r)\,dr < 2\pi n \int_0^\infty r\,d_g(r)\,dr <
2\pi n \int_0^\infty r\,d_e(r)\,dr$, thus $E[f_{2_{cc}}] < E[f_{2_g}] < E[f_{2_e}]$.
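
As a worked check of this ordering, the block below evaluates the three weighted integrals for one common parameterization of each damage function. The functional forms and the scale parameter $b$ are assumptions made for illustration; the parameterization used elsewhere in this dissertation may differ.

```latex
% Assumed forms, each normalized so that \int_0^\infty d(r)\,dr = LR:
%   cookie-cutter: d_{cc}(r) = 1 for r \le LR, and 0 otherwise
%   Gaussian:      d_g(r) = \exp\!\left(-r^2 / (2 b^2)\right), \quad b = LR\sqrt{2/\pi}
%   exponential:   d_e(r) = \exp(-r / LR)
\begin{align*}
  \int_0^\infty r\,d_{cc}(r)\,dr &= \int_0^{LR} r\,dr = \tfrac{1}{2}\,LR^2 , \\
  \int_0^\infty r\,d_g(r)\,dr &= b^2 = \tfrac{2}{\pi}\,LR^2 \approx 0.637\,LR^2 , \\
  \int_0^\infty r\,d_e(r)\,dr &= LR^2 .
\end{align*}
% Hence 0.5\,LR^2 < 0.637\,LR^2 < LR^2, matching the ordering in the theorem.
% Under these assumed forms the Gaussian and exponential curves cross at r = 4\,LR/\pi.
```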
Appendix B. Proof 2

THEOREM:  No  optimal  policy  recommends  firing  after  receiving  an  “illegitimate” 
call  when  p  >  0.5. 

PROOF: 

CASE  I:  Previous  to  the  “illegitimate”  call,  there  have  been  more  “legitimate”  calls 
than  “illegitimate”  calls. 

The “illegitimate” call would move the Markov decision process to a state
already visited in the scenario. Due to the Markov property, only the current state
(and not the path to that state) determines the policy for that state. If the optimal
policy at the new state were “shoot”, then the observer would already have shot when
previously at that state.

CASE  II:  Previous  to  the  “illegitimate”  call,  there  have  not  been  more  “legitimate” 
calls  than  “illegitimate”  calls. 

The belief state (legitimate, illegitimate) is $\left(\frac{q^i}{p^i+q^i}, \frac{p^i}{p^i+q^i}\right)$, with $q = 1 - p$, when there have been
$i$ more “illegitimate” calls than “legitimate” calls, with $i > 0$. With the likelihood of
a correct observation $p > 0.5$, then $\frac{p^i}{p^i+q^i} > \frac{q^i}{p^i+q^i}$, meaning the likelihood of the target
being illegitimate is greater than the likelihood of the target being legitimate. Since
the cost of leaving a legitimate target equals the cost of firing at an illegitimate target,
the cost of leaving the scenario ($\frac{q^i}{p^i+q^i} c_s$) is lower (more optimal) than the cost
of firing upon the target ($\frac{p^i}{p^i+q^i} c_s$).


Figure 60: Markov Transition Diagram